Error when running st_obj.fit() #3

Open
adoe21 opened this issue Aug 21, 2024 · 5 comments

Comments

adoe21 commented Aug 21, 2024

Hello,

I am trying to run the code in your vignette and am running into an error. When running the st_obj.fit step I get an error both when I use the example data and when I use the test data. The error is this:

st_obj = SHARE_topic(test_atac, test_rna, n_topics, alpha, beta, gamma, tau)
theta, lam, phi = st_obj.fit(batch_size, n_samples, n_burnin, dev=device, save_data=False, path="")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/groups/CEDAR/epiconfig/comparison/shareTopic/SHARE-Topic/SHARE_topic/SHARE_topic.py", line 294, in fit
    t,atac_cell_batches , theta_tmp, phi_tmp, lam_tmp,device)
    ^
UnboundLocalError: cannot access local variable 't' where it is not associated with a value

test_atac and test_rna are the example data, loaded using sc.read_h5ad().

Any help solving this issue would be much appreciated. Thank you!

Owner

Nour899 commented Aug 28, 2024

Hello,
Yes, indeed, you are right!
I was creating the t tensor only in the GPU case.
I have now corrected it for CPU users.
I do advise running the code on a GPU if one is available, because it will be slow otherwise.
Thanks for the comment and let me know if it works now for you.
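
For readers who hit the same traceback, the failure mode described above can be reproduced without PyTorch. The sketch below is not the code from SHARE_topic.py: `make_t_broken` and `make_t_fixed` are hypothetical names, and a plain list stands in for the `t` tensor.

```python
def make_t_broken(device: str) -> list:
    """Mimics the bug: `t` is only assigned on the GPU branch."""
    if device == "cuda":
        t = [0]  # stand-in for a tensor created on the GPU
    # On any other device `t` was never bound, so the return raises.
    return t


def make_t_fixed(device: str) -> list:
    """Binds `t` on every branch, so CPU callers get a value too."""
    if device == "cuda":
        t = [0]  # stand-in for a GPU tensor
    else:
        t = [0]  # stand-in for a CPU tensor
    return t


try:
    make_t_broken("cpu")
except UnboundLocalError as err:
    print(f"broken version on cpu raised {type(err).__name__}")

print("fixed version on cpu returned", make_t_fixed("cpu"))
```

The point is only that a name assigned inside one branch of an `if` must also be assigned on the other branch before it is used.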

Author

adoe21 commented Sep 9, 2024

Hello,
This worked, but then another error appeared later in the pipeline that I think might be due to a similar issue. When running st_obj.WAIC() I got this error:

Traceback (most recent call last):
  File "/home/groups/CEDAR/epiconfig/comparison/shareTopic/SHARE-Topic/PBMC_shareTopic.py", line 44, in <module>
    waic = st_obj.WAIC(batch_size, theta, lam, phi, device)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/groups/CEDAR/epiconfig/comparison/shareTopic/SHARE-Topic/SHARE_topic/SHARE_topic.py", line 412, in WAIC
    WAIC_rna = self.WAIC_RNA (batch_size, theta, lam, device)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/groups/CEDAR/epiconfig/comparison/shareTopic/SHARE-Topic/SHARE_topic/SHARE_topic.py", line 390, in WAIC_RNA
    theta_tmp=theta_tmp.type(torch.cuda.DoubleTensor)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Cannot initialize CUDA without ATen_cuda library. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason.  The CUDA library MUST be loaded, EVEN IF you don't directly use any symbols from the CUDA library! One common culprit is a lack of -Wl,--no-as-needed in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don't depend on any of their symbols.  You can check if this has occurred by using ldd on your binary to see if there is a dependency on *_cuda.so library.

Owner

Nour899 commented Sep 14, 2024

Yes, indeed, it is due to the same issue.
Thanks and let me know whether you are able to compute WAIC now.
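
As a rough illustration of why the hard-coded `torch.cuda.DoubleTensor` cast in the traceback fails on a CPU-only install, here is a minimal sketch of choosing the device once and reusing it. It is not the fix applied in the repository. `resolve_device` is a hypothetical helper, and the `cuda_available` flag is assumed to come from `torch.cuda.is_available()`.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Return a device name that is safe to use on this machine.

    Falls back to "cpu" when a CUDA device is requested but unavailable.
    """
    if requested.startswith("cuda") and not cuda_available:
        return "cpu"
    return requested


# Assumed usage with PyTorch installed (not executed here):
#   device = resolve_device(device, torch.cuda.is_available())
#   theta_tmp = theta_tmp.to(device=device, dtype=torch.float64)
# torch.float64 is the dtype behind DoubleTensor, so the cast keeps
# double precision without naming a CUDA-specific tensor type.

print(resolve_device("cuda", cuda_available=False))
print(resolve_device("cuda:0", cuda_available=True))
```

Passing the device and dtype to `Tensor.to` keeps one code path for both CPU and GPU users.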

Author

adoe21 commented Sep 17, 2024

Hello again,

I ended up getting access to GPUs and began running the GPU pipeline, which was significantly faster and helped me avoid these issues. However, I am now getting an error when I run the share_topic_output step. The error looks like this:
Traceback (most recent call last):
  File "/home/groups/CEDAR/epiconfig/comparison/shareTopic/SHARE-Topic/PBMC_shareTopic.py", line 49, in <module>
    rna, m_theta, m_lam, m_phi = st_obj.share_topic_output(rna_adata, n_topics, path, burnin_samples=50)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/groups/CEDAR/epiconfig/comparison/shareTopic/SHARE-Topic/SHARE_topic/SHARE_topic.py", line 444, in share_topic_output
    m_theta, m_lam, m_phi = self.read_samples (t, path, burnin_samples)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/groups/CEDAR/epiconfig/comparison/shareTopic/SHARE-Topic/SHARE_topic/SHARE_topic.py", line 419, in read_samples
    theta = torch.load(str(path)+"/theta_"+str(t)+".txt",map_location=torch.device('cpu'))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/groups/CEDAR/doe/miniconda3/envs/SHAREtopic/lib/python3.12/site-packages/torch/serialization.py", line 1065, in load
    with _open_file_like(f, 'rb') as opened_file:
         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/groups/CEDAR/doe/miniconda3/envs/SHAREtopic/lib/python3.12/site-packages/torch/serialization.py", line 468, in _open_file_like
    return _open_file(name_or_buffer, mode)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/groups/CEDAR/doe/miniconda3/envs/SHAREtopic/lib/python3.12/site-packages/torch/serialization.py", line 449, in __init__
    super().__init__(open(name, mode))
                     ^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'parameters/50_topics/theta_50.txt'

The path I am specifying in the command is the same path I have in my st_obj.fit() command. However, when I look at that path, the only saved files are an atac_share_topic.txt file and an rna_share_topic.txt file. I looked for this theta_50 file and didn't see it anywhere. Is it supposed to be saved by a different function, or is it being saved to a different location?
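
One way to narrow this down might be to check which sample files actually exist before calling share_topic_output. The sketch below only relies on the `theta_<t>.txt` naming visible in the traceback. `missing_theta_files` is a hypothetical helper, and the range of indices to check is an assumption, since the thread does not say which values of `t` the method reads.

```python
from pathlib import Path


def missing_theta_files(path: str, sample_indices: range) -> list[str]:
    """List the theta_<t>.txt files that are absent from `path`."""
    folder = Path(path)
    return [
        str(folder / f"theta_{t}.txt")
        for t in sample_indices
        if not (folder / f"theta_{t}.txt").is_file()
    ]


# Check only the file named in the traceback (index 50).
print(missing_theta_files("parameters/50_topics", range(50, 51)))
```

An empty list would mean every checked file is present. A non-empty list shows exactly which paths are missing from the folder.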

Author

adoe21 commented Oct 1, 2024

@Nour899 do you have an idea of what might be happening here?
