Platform tutorial: nf-core/rnaseq full #131
✅ Deploy Preview for seqera-docs ready!
…into rnaseq-full-guide
In general I like the flow, but it starts to drift into a descriptive document. This means it becomes quite verbose and it isn't obvious which steps to do next.
I would tighten up the second half to be more focused. You could probably achieve this by hacking bits out without adding too much, so I don't think it's a huge job.
platform_versioned_docs/version-23.4/getting-started/rnaseq.mdx
```console
# Create MDS plot
# a. Display in RStudio
plotMDS(y, col=as.numeric(factor(targets$Group)), labels=targets$Group)
legend("topright", legend=levels(factor(targets$Group)),
       col=1:nlevels(factor(targets$Group)), pch=20)

# b. Save MDS plot to file (change `png` to `pdf` to create a PDF file)
png("MDS_plot.png", width = 800, height = 600)
plotMDS(y, col=as.numeric(factor(targets$Group)), labels=targets$Group)
legend("topright", legend=levels(factor(targets$Group)),
       col=1:nlevels(factor(targets$Group)), pch=20)
dev.off()
```
Let's format the R code. You should be able to use Cmd+Shift+A (Reformat Code) in RStudio or something.
@adamrtalbot I need to run the R script again to use `salmon.merged.gene_counts_length_scaled.tsv` either way, so I will reformat while I'm in there. Just to confirm, by reformat you mean more than just swapping out `console` with `r` in the Markdown code blocks, right?
| **Pipeline step** | **Tools** | **Resource needs** | **Description** |
|---|---|---|---|
| **Quality Control (QC)** | FastQC, MultiQC | Moderate CPU, low memory | Initial quality checks of raw reads to assess sequencing quality and identify potential issues. |
| **Read Trimming** | Trim Galore! | Moderate CPU, moderate memory | Removal of adapter sequences and low-quality bases to prepare reads for alignment. |
| **Read Alignment** | HISAT2, STAR | High CPU, high memory | Alignment of trimmed reads to a reference genome, typically the most resource-intensive step. |
| **Quantification** | featureCounts, Salmon | Moderate CPU, moderate memory | Counting the number of reads mapped to each gene or transcript to measure expression levels. |
| **Differential Expression Analysis** | DESeq2, edgeR | Low CPU, moderate memory | Statistical analysis to identify genes with significant changes in expression between conditions. |
Shall we just put real numbers here?
Co-authored-by: Adam Talbot <[email protected]> Signed-off-by: Llewellyn vd Berg <[email protected]>
1. Read and convert the count data and sample information:

:::info
Replace `<PATH_TO_YOUR_COUNTS_FILE>` and `<PATH_TO_YOUR_SAMPLE_INFO_FILE>` with the paths to your `salmon.merged.gene_counts.tsv` and `sampleinfo.txt` files.
I'm not sure how pedantic we want to be here about correctness in analysis.
There can be important differences in effective gene length across conditions, which we normally account for in analysis. Using salmon.merged.gene_counts.tsv will just ignore that effect.
We can either model those length differences (preferable, but would probably add some unnecessary complexity here), or otherwise just use salmon.merged.gene_counts_length_scaled.tsv (which is probably the simplest thing to do here).
Does nf-core/rnaseq produce transcript-level quantification estimates that we can use (with tximport)?
The workflow does give you the raw outputs from Salmon / Kallisto, which are at the transcript level, and which is what tximport reads.
But it also uses tximport internally to produce those count matrices (salmon.merged.gene_counts_length_scaled.tsv, salmon.merged.gene_counts.tsv), and provides gene lengths (salmon.merged.gene_lengths.tsv) that can be used as offsets directly in downstream analysis.
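To illustrate why the choice of file matters, here is a minimal base R sketch of length scaling. It assumes the `lengthScaledTPM` approach that tximport uses (multiply abundance by the mean effective length per gene, then rescale each sample to its original library size). The function name `length_scaled_counts` and all toy values are hypothetical and are not taken from the pipeline or the tutorial.

```r
# Sketch only: mimics the idea behind length-scaled counts with made-up data.
# Rows are genes, columns are samples.
length_scaled_counts <- function(counts, abundance, lengths) {
  stopifnot(identical(dim(counts), dim(abundance)),
            identical(dim(counts), dim(lengths)))
  library_sizes <- colSums(counts)
  # Scale abundance by each gene's mean effective length across samples
  scaled <- abundance * rowMeans(lengths)
  # Rescale every sample (column) back to its original library size
  sweep(scaled, 2, library_sizes / colSums(scaled), `*`)
}

genes   <- c("geneA", "geneB")
samples <- c("ctrl", "treat")

# Hypothetical case: geneA's effective length doubles in "treat"
# (for example, through an isoform switch), so its raw count doubles too.
counts  <- matrix(c(100, 100, 200, 100), nrow = 2,
                  dimnames = list(genes, samples))
lengths <- matrix(c(1000, 1000, 2000, 1000), nrow = 2,
                  dimnames = list(genes, samples))

# TPM-style abundance: reads per base, normalized per sample to one million
rate      <- counts / lengths
abundance <- sweep(rate, 2, colSums(rate), `/`) * 1e6

scaled <- length_scaled_counts(counts, abundance, lengths)

print(counts["geneA", ] / counts["geneB", ])  # raw geneA/geneB ratio per sample
print(scaled["geneA", ] / scaled["geneB", ])  # length-scaled ratio per sample
```

In this toy case the raw geneA/geneB ratio is 1 in `ctrl` and 2 in `treat`, even though the underlying abundance is identical in both samples. After length scaling the ratio is 1.5 in both, so only the library size differs between samples, which normalization handles downstream.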
I'd err on the side of pedantic. So need to run the script again with salmon.merged.gene_counts_length_scaled.tsv and update the GIFs and steps. Thanks for the detailed feedback gents!
Co-authored-by: Jonathan Manning <[email protected]> Signed-off-by: Llewellyn vd Berg <[email protected]>
Co-authored-by: Florian Wuennemann <[email protected]> Signed-off-by: Llewellyn vd Berg <[email protected]>
@@ -18,7 +18,7 @@ Platform offers two methods to import pipelines to your workspace Launchpad —

![Seqera Pipelines overview](assets/seqera-pipelines-overview.gif)

-To import the `nf-core/rnaseq` pipeline:
+To import the `nf-core-rnaseq` pipeline:
Suggested change:
-To import the `nf-core-rnaseq` pipeline:
+To import the `nf-core/rnaseq` pipeline:
The GIF above shows `nf-core/rnaseq` as the title. Is there a reason you wanted to change it to `nf-core-rnaseq`?
Adds a one-page nf-core/rnaseq tutorial covering compute environment recommendations and configuration, importing the pipeline via Seqera Pipelines, adding data via Data Explorer and datasets, pipeline launch and monitoring, results analysis with Data Studios, pipeline optimization, and pipeline resource requirements based on input dataset size and benchmarking.