Tutorial not work #509

Open
galoisking opened this issue Jul 21, 2021 · 1 comment

Comments

@galoisking

I followed the tutorial at https://reagent.ai/rasp_tutorial.html#installing-reagent and ran:

./reagent/workflow/cli.py run reagent.workflow.training.identify_and_train_network "$CONFIG"

/home/circleci/project/ReAgent/reagent/preprocessing/preprocessor.py:120: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
input.shape == input_presence_byte.shape
/home/circleci/project/ReAgent/reagent/preprocessing/preprocessor.py:589: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
elif max_value.item() > MAX_FEATURE_VALUE:
/home/circleci/project/ReAgent/reagent/preprocessing/preprocessor.py:594: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
elif min_value.item() < MIN_FEATURE_VALUE:
I0721 100356.023 preprocessor.py:37] CUDA availability: False
I0721 100356.023 preprocessor.py:45] NOT Using GPU: GPU not requested or not available.
/home/circleci/project/ReAgent/reagent/prediction/predictor_wrapper.py:193: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert q_values.shape[1] == 2, f"{q_values.shape}"
I0721 100356.088 training.py:269] Saved default_model to DiscreteDQN_default_model_1626861836.torchscript
I0721 100356.090 training.py:269] Saved binary_difference_scorer to DiscreteDQN_binary_difference_scorer_1626861836.torchscript

(base) circleci@e79b99c2c4f9:~/project/ReAgent$ mkdir -p /tmp/0
(base) circleci@e79b99c2c4f9:~/project/ReAgent$ cp model_.torchscript /tmp/0/0

(base) circleci@e79b99c2c4f9:~/project/ReAgent$ python serving/examples/ecommerce/customer_simulator.py contextual_bandit.json
0
200
400
600
800
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/circleci/miniconda3/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/circleci/miniconda3/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "serving/examples/ecommerce/customer_simulator.py", line 49, in serve_customer
result = post(
File "serving/examples/ecommerce/customer_simulator.py", line 24, in post
response = urllib.request.urlopen(req, jsondataasbytes)
File "/home/circleci/miniconda3/lib/python3.8/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/home/circleci/miniconda3/lib/python3.8/urllib/request.py", line 525, in open
response = self._open(req, data)
File "/home/circleci/miniconda3/lib/python3.8/urllib/request.py", line 542, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "/home/circleci/miniconda3/lib/python3.8/urllib/request.py", line 502, in _call_chain
result = func(*args)
File "/home/circleci/miniconda3/lib/python3.8/urllib/request.py", line 1379, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/home/circleci/miniconda3/lib/python3.8/urllib/request.py", line 1354, in do_open
r = h.getresponse()
File "/home/circleci/miniconda3/lib/python3.8/http/client.py", line 1347, in getresponse
response.begin()
File "/home/circleci/miniconda3/lib/python3.8/http/client.py", line 307, in begin
version, status, reason = self._read_status()
File "/home/circleci/miniconda3/lib/python3.8/http/client.py", line 276, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "serving/examples/ecommerce/customer_simulator.py", line 83, in <module>
results: List[Tuple[str, float]] = p.map(serve_customer, list(range(EPOCHS)))
File "/home/circleci/miniconda3/lib/python3.8/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home/circleci/miniconda3/lib/python3.8/multiprocessing/pool.py", line 771, in get
raise self._value
http.client.RemoteDisconnected: Remote end closed connection without response
[1]+ Aborted (core dumped) nohup ./serving/build/RaspCli --logtostderr > cli.log

(base) circleci@e79b99c2c4f9:~/project/ReAgent$ cat cli.log
I0721 10:05:05.707381 9778 DiskConfigProvider.cpp:9] READING CONFIGS FROM serving/examples/ecommerce/plans
I0721 10:05:05.707865 9778 DiskConfigProvider.cpp:48] GOT CONFIG contextual_bandit.json AT serving/examples/ecommerce/plans/contextual_bandit.json
I0721 10:05:05.707962 9778 DiskConfigProvider.cpp:52] Registered decision config: contextual_bandit.json
I0721 10:05:05.708199 9778 DiskConfigProvider.cpp:48] GOT CONFIG heuristic.json AT serving/examples/ecommerce/plans/heuristic.json
I0721 10:05:05.708250 9778 DiskConfigProvider.cpp:52] Registered decision config: heuristic.json
I0721 10:05:05.708446 9778 DiskConfigProvider.cpp:48] GOT CONFIG multi_armed_bandit.json AT serving/examples/ecommerce/plans/multi_armed_bandit.json
I0721 10:05:05.708492 9778 DiskConfigProvider.cpp:52] Registered decision config: multi_armed_bandit.json
I0721 10:05:05.708657 9787 Server.cpp:58] STARTING SERVER
[F PytorchActionValueScorer.cpp:74] TORCH ERROR: forward() Expected a value of type '__torch__.reagent.core.types.ServingFeatureData' for argument 'state' but instead found type 'Tuple[Tensor, Tensor]'.
Position: 1
Declaration: forward(__torch__.reagent.prediction.predictor_wrapper.DiscreteDqnPredictorWrapper self, __torch__.reagent.core.types.ServingFeatureData state) -> ((str[], Tensor))
Exception raised from checkArg at ../aten/src/ATen/core/function_schema_inl.h:162 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7ff39a7067eb in /home/circleci/libtorch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xce (0x7ff39a70246e in /home/circleci/libtorch/lib/libc10.so)
frame #2: <unknown function> + 0x10194a2 (0x7ff385e5d4a2 in /home/circleci/libtorch/lib/libtorch_cpu.so)
frame #3: <unknown function> + 0x101d731 (0x7ff385e61731 in /home/circleci/libtorch/lib/libtorch_cpu.so)
frame #4: torch::jit::GraphFunction::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, c10::IValue, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, c10::IValue> > > const&) + 0x2d (0x7ff388703e3d in /home/circleci/libtorch/lib/libtorch_cpu.so)
frame #5: torch::jit::Method::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, c10::IValue, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, c10::IValue> > > const&) + 0x161 (0x7ff388713eb1 in /home/circleci/libtorch/lib/libtorch_cpu.so)
frame #6: torch::jit::Module::forward(std::vector<c10::IValue, std::allocator<c10::IValue> >) + 0x10c (0x7ff399f4540a in ./serving/build/RaspCli)
frame #7: reagent::PytorchActionValueScorer::predict[abi:cxx11](reagent::DecisionRequest const&, int, int) + 0x927 (0x7ff399f413ff in ./serving/build/RaspCli)
frame #8: reagent::ActionValueScoring::runInternal[abi:cxx11](int, int, reagent::DecisionRequest const&) + 0x5c (0x7ff39a28af52 in ./serving/build/RaspCli)
frame #9: reagent::ActionValueScoring::run(reagent::DecisionRequest const&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::variant<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > >, std::vector<long, std::allocator >, std::vector<double, std::allocator >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, long, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, long> > >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, double, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, double> > >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, double, std::hash<std::__cxx11::basic_string<char, std::char_traits, 
std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, double> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, double, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, double> > > > > >, std::vector<reagent::ActionDetails, std::allocatorreagent::ActionDetails > >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, std::variant<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > >, std::vector<long, std::allocator >, std::vector<double, std::allocator >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, 
std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, long, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, long> > >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, double, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, double> > >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, double, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, double> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, double, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, double> > > > > >, std::vector<reagent::ActionDetails, 
std::allocatorreagent::ActionDetails > > > > > const&) + 0x133 (0x7ff39a28ae39 in ./serving/build/RaspCli)
frame #10: <unknown function> + 0xd84fe6 (0x7ff39a24efe6 in ./serving/build/RaspCli)
frame #11: <unknown function> + 0xd871a0 (0x7ff39a2511a0 in ./serving/build/RaspCli)
frame #12: std::function<void ()>::operator()() const + 0x32 (0x7ff39a25bce2 in ./serving/build/RaspCli)
frame #13: void std::__invoke_impl<void, std::function<void ()>&>(std::__invoke_other, std::function<void ()>&) + 0x20 (0x7ff39a258da8 in ./serving/build/RaspCli)
frame #14: std::__invoke_result<std::function<void ()>&>::type std::__invoke<std::function<void ()>&>(std::function<void ()>&) + 0x26 (0x7ff39a256723 in ./serving/build/RaspCli)
frame #15: std::invoke_result<std::function<void ()>&>::type std::invoke<std::function<void ()>&>(std::function<void ()>&) + 0x20 (0x7ff39a254c2d in ./serving/build/RaspCli)
frame #16: tf::Executor::_invoke_static_work(unsigned int, tf::Node*) + 0xf3 (0x7ff39a27ef37 in ./serving/build/RaspCli)
frame #17: tf::Executor::_invoke(unsigned int, tf::Node*) + 0x11b (0x7ff39a27e8ef in ./serving/build/RaspCli)
frame #18: tf::Executor::_exploit_task(unsigned int, std::optionaltf::Node*&) + 0x12e (0x7ff39a27e036 in ./serving/build/RaspCli)
frame #19: tf::Executor::_spawn(unsigned int)::{lambda()#1}::operator()() const + 0x78 (0x7ff39a27dbba in ./serving/build/RaspCli)
frame #20: void std::__invoke_impl<void, tf::Executor::_spawn(unsigned int)::{lambda()#1}>(std::__invoke_other, tf::Executor::_spawn(unsigned int)::{lambda()#1}&&) + 0x20 (0x7ff39a283f02 in ./serving/build/RaspCli)
frame #21: std::__invoke_result<tf::Executor::_spawn(unsigned int)::{lambda()#1}>::type std::__invoke<tf::Executor::_spawn(unsigned int)::{lambda()#1}>(std::__invoke_result&&, (tf::Executor::_spawn(unsigned int)::{lambda()#1}&&)...) + 0x26 (0x7ff39a283233 in ./serving/build/RaspCli)
frame #22: decltype (__invoke((_S_declval<0ul>)())) std::thread::_Invoker<std::tuple<tf::Executor::_spawn(unsigned int)::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) + 0x28 (0x7ff39a285528 in ./serving/build/RaspCli)
frame #23: std::thread::_Invoker<std::tuple<tf::Executor::_spawn(unsigned int)::{lambda()#1}> >::operator()() + 0x1d (0x7ff39a28548f in ./serving/build/RaspCli)
frame #24: std::thread::_State_impl<std::thread::_Invoker<std::tuple<tf::Executor::_spawn(unsigned int)::{lambda()#1}> > >::_M_run() + 0x1c (0x7ff39a28542e in ./serving/build/RaspCli)
frame #25: <unknown function> + 0xc819d (0x7ff39941b19d in /home/circleci/miniconda/lib/libstdc++.so.6)
frame #26: <unknown function> + 0x76db (0x7ff384c2c6db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #27: clone + 0x3f (0x7ff3843b288f in /lib/x86_64-linux-gnu/libc.so.6)
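
The Python traceback above only says the remote end hung up; the actual cause is in the server log. Below is a minimal sketch of how a client-side `post` helper could turn `http.client.RemoteDisconnected` into a message that points at the server log. This is not the code in `customer_simulator.py`: the helper body, the throwaway local server, and the payload are assumptions made for illustration, and the server here closes the connection on purpose to mimic a crashed RaspCli.

```python
import http.client
import json
import socket
import threading
import urllib.request


def post(url: str, payload: dict) -> dict:
    """POST JSON and turn a dropped connection into a clearer error (hypothetical helper)."""
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # Bypass any proxy configured in the environment so localhost is hit directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(req, timeout=5) as response:
            return json.loads(response.read())
    except http.client.RemoteDisconnected as err:
        raise RuntimeError(
            f"Server at {url} closed the connection without replying; "
            "it may have crashed, so check the server log."
        ) from err


def _closing_server(listener: socket.socket) -> None:
    """Accept one connection and hang up without sending any response."""
    conn, _ = listener.accept()
    conn.shutdown(socket.SHUT_WR)  # client sees EOF when it reads the status line
    while conn.recv(4096):         # drain the request until the client closes
        pass
    conn.close()


# Stand-in for a server that dies mid-request (assumption: local port chosen by the OS).
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=_closing_server, args=(listener,), daemon=True).start()

try:
    post(f"http://127.0.0.1:{port}/", {"demo": True})
    outcome = "unexpected success"
except RuntimeError as err:
    outcome = str(err)
print(outcome)
```

With a wrapper like this, the simulator would fail with a hint to read `cli.log` instead of a bare `RemoteDisconnected`.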

@rangi513

I'm getting the same error when running through the tutorial. At the customer_simulator.py step, when it posts to RASP to score the prediction, the server logs show this error:

[F PytorchActionValueScorer.cpp:75] TORCH ERROR: forward() Expected a value of type '__torch__.reagent.core.types.ServingFeatureData' for argument 'state' but instead found type 'Tuple[Tensor, Tensor]'.
Position: 1
Declaration: forward(__torch__.reagent.prediction.predictor_wrapper.DiscreteDqnPredictorWrapper self, __torch__.reagent.core.types.ServingFeatureData state) -> ((str[], Tensor))
Exception raised from checkArg at ../aten/src/ATen/core/function_schema_inl.h:162 (most recent call first)

I've traced the error down to model.forward(inputs) here: https://github.com/facebookresearch/ReAgent/blob/master/serving/reagent/serving/core/PytorchActionValueScorer.cpp#L50
Maybe the request for the state features in the example needs to be changed somehow?
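
To make the mismatch in the error message concrete, here is a small sketch in plain Python with no PyTorch dependency. It assumes `ServingFeatureData` is a named tuple whose first field holds the `(values, presence)` pair; the field names are assumptions for illustration and should be checked against `reagent/core/types.py`. `check_state_arg` is a hypothetical stand-in for the TorchScript schema check, not a ReAgent or PyTorch function.

```python
from typing import Dict, List, NamedTuple, Tuple

# Stand-in for torch.Tensor so the sketch runs without PyTorch installed.
Tensor = List[List[float]]


class ServingFeatureData(NamedTuple):
    # Field names are assumed for illustration only.
    float_features_with_presence: Tuple[Tensor, Tensor]
    id_list_features: Dict[int, Tuple[Tensor, Tensor]]
    id_score_list_features: Dict[int, Tuple[Tensor, Tensor, Tensor]]


def check_state_arg(state: object) -> str:
    """Mimic the schema check: accept only the named tuple type."""
    if isinstance(state, ServingFeatureData):
        return "ok"
    return (
        "Expected ServingFeatureData for argument 'state' "
        f"but found {type(state).__name__}"
    )


values = [[0.5, 1.0]]    # feature values
presence = [[1.0, 1.0]]  # 1.0 where the feature is present

# What the error says was passed: a bare pair of tensors.
bare_pair = (values, presence)

# What the declaration asks for: the pair wrapped in the named tuple.
wrapped = ServingFeatureData(
    float_features_with_presence=bare_pair,
    id_list_features={},
    id_score_list_features={},
)

print(check_state_arg(bare_pair))
print(check_state_arg(wrapped))
```

If this reading is right, the serving side would need to build the wrapped structure before calling `model.forward(inputs)`, or the exported wrapper would need to accept the bare pair; the thread does not confirm which side is meant to change.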
