Title: | Analysis of RNA Crosslinking Data |
---|---|
Description: | Analysis of RNA crosslinking data for RNA structure prediction. The package is suitable for the analysis of RNA structure cross-linking data and chemical probing data. |
Authors: | Jonathan Price [aut, cre] |
Maintainer: | Jonathan Price <[email protected]> |
License: | GPL-3 |
Version: | 0.1.4 |
Built: | 2025-02-21 06:19:19 UTC |
Source: | https://github.com/jlp-bioinf/rnacrosslinkoo |
Extract the cluster coordinates in granges format
clusterGrangesList(x)
clusterGrangesList(x)
x |
A rnaCrosslinkDataSet object |
A list of Granges objects showing the positions of each cluster, one entry for each sample
cds = makeExamplernaCrosslinkDataSet() clusterGrangesList(cds)
cds = makeExamplernaCrosslinkDataSet() clusterGrangesList(cds)
Set new clusterGrangesList slot
clusterGrangesList(x) <- value
clusterGrangesList(x) <- value
x |
A rnaCrosslinkDataSet object |
value |
A replacement |
No return - Sets a new clusterGrangesList slot
cds = makeExamplernaCrosslinkDataSet() newclusterGrangesList <- clusterGrangesList(cds) clusterGrangesList(cds) <- newclusterGrangesList
cds = makeExamplernaCrosslinkDataSet() newclusterGrangesList <- clusterGrangesList(cds) clusterGrangesList(cds) <- newclusterGrangesList
This method prints a table showing the number of clusters in each step of the analysis
clusterNumbers(knowClusteredCds, rna)
clusterNumbers(knowClusteredCds, rna)
knowClusteredCds |
A rnaCrosslinkDataSet object after clustering has been performed |
rna |
The RNA ID of interest - use rna(cdsObject). |
A data.frame shoing the number of clusters for each sample
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 1) clusterNumbers(clusteredCds)
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 1) clusterNumbers(clusteredCds)
This method clusters the duplexes.
clusterrnaCrosslink(cds, cores = 3, stepCount = 2, clusterCutoff = 20)
clusterrnaCrosslink(cds, cores = 3, stepCount = 2, clusterCutoff = 20)
cds |
rnaCrosslinkDataSet object created with rnaCrosslinkDataSet |
cores |
numeric - The number of cores to use |
stepCount |
Stringency for clustering |
clusterCutoff |
The minimum number of reads a cluster requires |
A rnaCrosslinkDataSet
object
cds = makeExamplernaCrosslinkDataSet() clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0)
cds = makeExamplernaCrosslinkDataSet() clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0)
Extract the cluster coordinates with fold prediciton in data frame format
clusterTableFolded(x)
clusterTableFolded(x)
x |
A rnaCrosslinkDataSet object |
A table showing the vienna structures of each cluster
cds = makeExamplernaCrosslinkDataSet() clusterTableFolded(cds)
cds = makeExamplernaCrosslinkDataSet() clusterTableFolded(cds)
Extract the cluster coordinates in data frame format
clusterTableList(x)
clusterTableList(x)
x |
A rnaCrosslinkDataSet object |
A list of tables showing the vienna structures of each cluster
cds = makeExamplernaCrosslinkDataSet() clusterTableList(cds)
cds = makeExamplernaCrosslinkDataSet() clusterTableList(cds)
Set new clusterTableList slot
clusterTableList(x) <- value
clusterTableList(x) <- value
x |
A rnaCrosslinkDataSet object |
value |
A replacement |
No return - Sets a new clusterTableList slot
cds = makeExamplernaCrosslinkDataSet() newclusterGrangesList <- clusterTableList(cds) clusterTableList(cds) <- newclusterGrangesList
cds = makeExamplernaCrosslinkDataSet() newclusterGrangesList <- clusterTableList(cds) clusterTableList(cds) <- newclusterGrangesList
This method compares the current object to a know structure.run
trimClusters()
on the rnaCrosslinkDataSet
first
compareKnown(trimmedClusters, knownMat, type)
compareKnown(trimmedClusters, knownMat, type)
trimmedClusters |
a |
knownMat |
Matrix - A marix(ncol = lengthRNA,nrow = lengthRNA) where a value in matrix[x,y] would indicate a known interation between nucleotide x and nucleotide y |
type |
string - the Analysis stage of clusters you would like to compare you can find available types by just running the objects name |
Returns a rnaCrosslinkClusteredDataSet
object
The 3 attributes matrixList, clusterTableList and clusterGrangesList
will gain the types
"known" and "novel" and "knownAndNovel"
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0) knownMat = matrix(0, ncol = rnaSize(cds), nrow = rnaSize(cds)) knownMat[7,27] = 1 # use compare known to gett he known and not know clusters knowClusteredCds = compareKnown(clusteredCds, knownMat, "original") clusterNumbers(knowClusteredCds)
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0) knownMat = matrix(0, ncol = rnaSize(cds), nrow = rnaSize(cds)) knownMat[7,27] = 1 # use compare known to gett he known and not know clusters knowClusteredCds = compareKnown(clusteredCds, knownMat, "original") clusterNumbers(knowClusteredCds)
This method compares the predicted structures to a set of known interactions
compareKnownStructures(foldedCds, file)
compareKnownStructures(foldedCds, file)
foldedCds |
|
file |
a two column file with column header i and j with numeric values showing nucleoide i binds to nucleotide j |
Returns a dataframe
a tables showing the number of predicted interactions and their agreement
## Not run: cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds = cds, cores = 3, stepCount = 2, clusterCutoff = 1) trimmedClusters = trimClusters(clusteredCds = clusteredCds,trimFactor = 1, clusterCutoff = 1) fasta = paste(c(rep('A',25), rep('T',25), rep('A',10), rep('T',23)),collapse = "") header = '>transcript1' fastaFile = tempfile() writeLines(paste(header,fasta,sep = "\n"),con = fastaFile) rnaRefs = list() rnaRefs[[rnas(cds)]] = read.fasta(fastaFile) rnaRefs foldedCds = foldrnaCrosslink(trimmedClusters, rnaRefs = rnaRefs, start = 1, end = 83, shape = 0, ensembl = 5, constraintNumber = 1, evCutoff = 1) # make an example table of "know" interactions file = data.frame(V1 = c(6), V2 = c(80)) compareKnownStructures(foldedCds,file) ## End(Not run)
## Not run: cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds = cds, cores = 3, stepCount = 2, clusterCutoff = 1) trimmedClusters = trimClusters(clusteredCds = clusteredCds,trimFactor = 1, clusterCutoff = 1) fasta = paste(c(rep('A',25), rep('T',25), rep('A',10), rep('T',23)),collapse = "") header = '>transcript1' fastaFile = tempfile() writeLines(paste(header,fasta,sep = "\n"),con = fastaFile) rnaRefs = list() rnaRefs[[rnas(cds)]] = read.fasta(fastaFile) rnaRefs foldedCds = foldrnaCrosslink(trimmedClusters, rnaRefs = rnaRefs, start = 1, end = 83, shape = 0, ensembl = 5, constraintNumber = 1, evCutoff = 1) # make an example table of "know" interactions file = data.frame(V1 = c(6), V2 = c(80)) compareKnownStructures(foldedCds,file) ## End(Not run)
Produces a list list of 2 elemnts 'transcript' and 'family' Each element contains a table with the counts for each RNA in each sample that interact with the target RNA
featureInfo(cds)
featureInfo(cds)
cds |
a |
A list - Feature level and transcript level counts for each sample
cds = makeExamplernaCrosslinkDataSet() featureInfo(cds)
cds = makeExamplernaCrosslinkDataSet() featureInfo(cds)
Folds the clusters using Vienna RNAfold
findBasePairsRNAcoFold2( startPos1, endPos1, seq1, startPos2, endPos2, seq2, fasta, shape )
findBasePairsRNAcoFold2( startPos1, endPos1, seq1, startPos2, endPos2, seq2, fasta, shape )
startPos1 |
Start of the cluster side x |
endPos1 |
End of the cluster side x |
seq1 |
Sequence of x |
startPos2 |
Start of the cluster side y |
endPos2 |
End of the cluster side y |
seq2 |
Sequence of y |
fasta |
|
shape |
shape reactivities |
A table of clusters and coordinates with folds
Folds the clusters using Vienna RNA duplex
findBasePairsRNAfold(startPos, endPos, seqs, fasta, shape)
findBasePairsRNAfold(startPos, endPos, seqs, fasta, shape)
startPos |
Start of the cluster side x |
endPos |
End of the cluster side x |
seqs |
Sequence of x |
fasta |
|
shape |
shape reactivities |
A table of clusters and coordinates with folds
Folds the clusters using Vienna RNA duplex
findBasePairsRNAfold2(startPos, endPos, seqs, fasta)
findBasePairsRNAfold2(startPos, endPos, seqs, fasta)
startPos |
Start of the cluster side x |
endPos |
End of the cluster side x |
seqs |
Sequence of x |
fasta |
|
A table of clusters and coordinates with folds
This methods folds an ensebl of structures for the whole RNA or chosen region
of the RNA. See rnaCrosslinkDataSet
for slot information.
foldrnaCrosslink( cdsObject, rnaRefs, start, end, evCutoff = 1, ensembl = 50, constraintNumber = 20, shape = 0 )
foldrnaCrosslink( cdsObject, rnaRefs, start, end, evCutoff = 1, ensembl = 50, constraintNumber = 20, shape = 0 )
cdsObject |
rnaCrosslinkDataSet object created with rnaCrosslinkDataSet |
rnaRefs |
named List - a list with named elements that correspond to the
.RNA of interest. The element of the list must be a fasta file that has
been read with |
start |
Start of segmnent to fold |
end |
End of segmnent to fold |
evCutoff |
Mininum number of read support for contraint to be included in folding |
ensembl |
Number of structures to Nake |
constraintNumber |
Number of constraints to add to each final fold |
shape |
shape reactivities (0 for no constraints) |
a rnaCrosslinkDataSet object
## Not run: cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0) trimmedClusters = trimClusters(clusteredCds = clusteredCds, trimFactor = 1, clusterCutoff = 0) fasta = paste(c(rep('A',25), rep('T',25), rep('A',10), rep('T',23)),collapse = "") header = '>transcript1' fastaFile = tempfile() writeLines(paste(header,fasta,sep = "\n"),con = fastaFile) rnaRefs = list() rnaRefs[[rnas(cds)]] = read.fasta(fastaFile) rnaRefs foldedCds = foldrnaCrosslink(trimmedClusters, rnaRefs = rnaRefs, start = 1, end = 83, shape = 0, ensembl = 5, constraintNumber = 1, evCutoff = 1) foldedCds ## End(Not run)
## Not run: cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0) trimmedClusters = trimClusters(clusteredCds = clusteredCds, trimFactor = 1, clusterCutoff = 0) fasta = paste(c(rep('A',25), rep('T',25), rep('A',10), rep('T',23)),collapse = "") header = '>transcript1' fastaFile = tempfile() writeLines(paste(header,fasta,sep = "\n"),con = fastaFile) rnaRefs = list() rnaRefs[[rnas(cds)]] = read.fasta(fastaFile) rnaRefs foldedCds = foldrnaCrosslink(trimmedClusters, rnaRefs = rnaRefs, start = 1, end = 83, shape = 0, ensembl = 5, constraintNumber = 1, evCutoff = 1) foldedCds ## End(Not run)
Makes and adjacency matrix list (for clustering)
getAdjacancyMat(InputGranges, nucletideOrPerc, cutoff)
getAdjacancyMat(InputGranges, nucletideOrPerc, cutoff)
InputGranges |
list created with InputToGRanges (but just the gap section of the list) |
nucletideOrPerc |
measure difference by percentage or nucleotides |
cutoff |
The maximum difference before giving these two gaps 0 |
Makes and adjacency matrix list (for clustering)
A list of Adjacancy matrices
Decides if a cluster is long or short range then either grabs the whole sequence or the sequence of the two sides of the interaction separately.
getClusterClusterShortRangeWhole(cluster, seq)
getClusterClusterShortRangeWhole(cluster, seq)
cluster |
cluster positions |
seq |
sequence of transcript |
The same table with an extra column
Get data is more generic method for retrieving data from the object and returns a list, the number of entries in the list is number of samples in the dataset and the list contain entries of the data type and analysis stage you select.
getData(x, data, type)
getData(x, data, type)
x |
A rnaCrosslinkDataSet object |
data |
The data type to return <InputFiles | matrixList | clusterGrangesList | clusterTableList> |
type |
The analysis stage <original | noHost | originalClusters | trimmedClusters> |
A list of the chosen data type - one entry for each sample
cds = makeExamplernaCrosslinkDataSet() getData(cds, 'matrixList','original')
cds = makeExamplernaCrosslinkDataSet() getData(cds, 'matrixList','original')
This method returns a table of interactions of an RNA (interactor) on the RNA of interest.
getInteractions(cds, interactors)
getInteractions(cds, interactors)
cds |
a |
interactors |
A vector containing the names of RNAs to show interactions with |
A table showing the read coverage of the specified interacting RNAs
cds = makeExamplernaCrosslinkDataSet() getInteractions(cds, c("transcript1","transcript2"))
cds = makeExamplernaCrosslinkDataSet() getInteractions(cds, c("transcript1","transcript2"))
Make a matrix of contact interactions
getMatrices(InputList, rna, size)
getMatrices(InputList, rna, size)
InputList |
the original InputList created with readInputFiles or subsetInputList |
rna |
the RNA of interest that you want to subset |
size |
The size of the RNA |
Function used to create a list of matrices for plotting with plotMatrixList or plotMatrixListFull, the output list will be same as the input except for an extra list layer for the specific RNA
A list of matrices
This method prints interactions of the RNA of interest on another RNA transcript.
getReverseInteractions(cds, interactor)
getReverseInteractions(cds, interactor)
cds |
a |
interactor |
The rna to show interactions with |
A long format table shoing the read coverage of chosen RNA
cds = makeExamplernaCrosslinkDataSet() getReverseInteractions(cds, 'transcript2')
cds = makeExamplernaCrosslinkDataSet() getReverseInteractions(cds, 'transcript2')
Extract the indeces for each group for the instance
group(x)
group(x)
x |
A rnaCrosslinkDataSet object |
A list - The indices of the sample in the control and sample groups
cds = makeExamplernaCrosslinkDataSet() group(cds)
cds = makeExamplernaCrosslinkDataSet() group(cds)
Extract the data in original format
InputFiles(x)
InputFiles(x)
x |
A rnaCrosslinkDataSet object |
A list of tables in the original input format, one entry for each sample
cds = makeExamplernaCrosslinkDataSet() InputFiles(cds)
cds = makeExamplernaCrosslinkDataSet() InputFiles(cds)
This function is useful to turn a list of Input data into lists of GRanges It creates a list for each sample one for the left side one for the right side and one for the gap in the middle.
InputToGRanges(InputList, rna)
InputToGRanges(InputList, rna)
InputList |
the original InputList created with readInputFiles or subsetInputList |
rna |
The rna of interest |
A list of GRanges data in Input format
Creat a minimal example rnaCrosslinkdataSetObject
makeExamplernaCrosslinkDataSet()
makeExamplernaCrosslinkDataSet()
An example rnaCrosslinkDataSet objext
cds = makeExamplernaCrosslinkDataSet()
cds = makeExamplernaCrosslinkDataSet()
Extract the contact matrices
matrixList(x)
matrixList(x)
x |
A rnaCrosslinkDataSet object |
A list of contract matrices, one entry for each sample
cds = makeExamplernaCrosslinkDataSet() matrixList(cds)
cds = makeExamplernaCrosslinkDataSet() matrixList(cds)
Set new matrixList slot
matrixList(x) <- value
matrixList(x) <- value
x |
A rnaCrosslinkDataSet object |
value |
A replacement |
No return - Sets a new matrixList slot
cds = makeExamplernaCrosslinkDataSet() newMatrixList <- matrixList(cds) matrixList(cds) <- newMatrixList
cds = makeExamplernaCrosslinkDataSet() newMatrixList <- matrixList(cds) matrixList(cds) <- newMatrixList
Plot a heatmap that plots the agreements between replicates after clusterrnaCrosslink has been performed
plotClusterAgreement(cds, analysisStage = "originalClusters")
plotClusterAgreement(cds, analysisStage = "originalClusters")
cds |
A rnaCrosslinkDataSet object |
analysisStage |
The stage of the analysis to plot |
A heatmap of the agreement between replicates in the analysis stage chosen
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0) plotClusterAgreement(cds)
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0) plotClusterAgreement(cds)
Plot a heatmap that plots the agreements between replicates after clusterrnaCrosslink has been performed
plotClusterAgreementHeat(cds, analysisStage = "originalClusters")
plotClusterAgreementHeat(cds, analysisStage = "originalClusters")
cds |
A rnaCrosslinkDataSet object |
analysisStage |
The stage of the analysis to plot |
A heatmap of the agreement between replicates in the analysis stage chosen
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0) plotClusterAgreementHeat(cds)
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0) plotClusterAgreementHeat(cds)
Plots a contact map of two chosen samples for chosen stages in the analysis, with each chosen sample on separate halves of the contact map
plotCombinedMatrix( cds, type1 = "original", type2 = "original", sample1 = 1, sample2 = 1, directory = 0, a = 1, b = 50, c = 1, d = 50, h = 3, returnData = FALSE )
plotCombinedMatrix( cds, type1 = "original", type2 = "original", sample1 = 1, sample2 = 1, directory = 0, a = 1, b = 50, c = 1, d = 50, h = 3, returnData = FALSE )
cds |
A rnaCrosslinkDataSet object |
type1 |
The analysis stage to plot on the upper half of the heatmap |
type2 |
The analysis stage to plot on the lower half of the heatmap |
sample1 |
The sample number to plot on the upper half of the heatmap |
sample2 |
The sample number to plot on the upper half of the heatmap |
directory |
An output directory for the heatmap (use 0 for no output) |
a |
To make a subsetted plot (left value on x) |
b |
To make a subsetted plot (right value on x) |
c |
To make a subsetted plot (left value on y) |
d |
To make a subsetted plot (right value on y) |
h |
Height of image (inches) (only useful if plotting) |
returnData |
if TRUE matrix is returned instead of plotting |
A heatmap of the reads of the chosen sample numbers, in the analysis stages chosen, with each chosen sample on a separate half of the heatmap
cds = makeExamplernaCrosslinkDataSet() plotCombinedMatrix(cds, type1 = "original", type2 = "noHost", b = rnaSize(cds), d = rnaSize(cds))
cds = makeExamplernaCrosslinkDataSet() plotCombinedMatrix(cds, type1 = "original", type2 = "noHost", b = rnaSize(cds), d = rnaSize(cds))
This method plots two structures chosen from the plotEnsemblePCA method
plotComparisonArc(foldedCds, s1 = "s1", s2 = "s2", n1 = 1, n2 = 2)
plotComparisonArc(foldedCds, s1 = "s1", s2 = "s2", n1 = 1, n2 = 2)
foldedCds |
|
s1 |
sample of structure 1 |
s2 |
sample of structure 2 |
n1 |
number of structure from first sample |
n2 |
number of structure from first sample |
an ark diagram of the two strcutures
## Not run: cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds = cds, cores = 3, stepCount = 2, clusterCutoff = 1) trimmedClusters = trimClusters(clusteredCds = clusteredCds,trimFactor = 1, clusterCutoff = 1) fasta = paste(c(rep('A',25), rep('T',25), rep('A',10), rep('T',23)),collapse = "") header = '>transcript1' fastaFile = tempfile() writeLines(paste(header,fasta,sep = "\n"),con = fastaFile) rnaRefs = list() rnaRefs[[rnas(cds)]] = read.fasta(fastaFile) rnaRefs foldedCds = foldrnaCrosslink(trimmedClusters, rnaRefs = rnaRefs, start = 1, end = 83, shape = 0, ensembl = 5, constraintNumber = 1, evCutoff = 1) plotComparisonArc(foldedCds,"s1","s1",1,3) ## End(Not run)
## Not run: cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds = cds, cores = 3, stepCount = 2, clusterCutoff = 1) trimmedClusters = trimClusters(clusteredCds = clusteredCds,trimFactor = 1, clusterCutoff = 1) fasta = paste(c(rep('A',25), rep('T',25), rep('A',10), rep('T',23)),collapse = "") header = '>transcript1' fastaFile = tempfile() writeLines(paste(header,fasta,sep = "\n"),con = fastaFile) rnaRefs = list() rnaRefs[[rnas(cds)]] = read.fasta(fastaFile) rnaRefs foldedCds = foldrnaCrosslink(trimmedClusters, rnaRefs = rnaRefs, start = 1, end = 83, shape = 0, ensembl = 5, constraintNumber = 1, evCutoff = 1) plotComparisonArc(foldedCds,"s1","s1",1,3) ## End(Not run)
This method plots a PCA of the ensembl
plotEnsemblePCA(foldedCds, labels = TRUE, split = TRUE)
plotEnsemblePCA(foldedCds, labels = TRUE, split = TRUE)
foldedCds |
|
labels |
plot with labels or not (TRUE/FALSE) |
split |
split the plot using facets based on the samples (TRUE/FALSE) |
a PCA plot of the ensembl
## Not run: cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds = cds, cores = 3, stepCount = 2, clusterCutoff = 1) trimmedClusters = trimClusters(clusteredCds = clusteredCds,trimFactor = 1, clusterCutoff = 1) fasta = paste(c(rep('A',25), rep('T',25), rep('A',10), rep('T',23)),collapse = "") header = '>transcript1' fastaFile = tempfile() writeLines(paste(header,fasta,sep = "\n"),con = fastaFile) rnaRefs = list() rnaRefs[[rnas(cds)]] = read.fasta(fastaFile) rnaRefs foldedCds = foldrnaCrosslink(trimmedClusters, rnaRefs = rnaRefs, start = 1, end = 83, shape = 0, ensembl = 5, constraintNumber = 1, evCutoff = 1) plotEnsemblePCA(foldedCds) ## End(Not run)
## Not run: cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds = cds, cores = 3, stepCount = 2, clusterCutoff = 1) trimmedClusters = trimClusters(clusteredCds = clusteredCds,trimFactor = 1, clusterCutoff = 1) fasta = paste(c(rep('A',25), rep('T',25), rep('A',10), rep('T',23)),collapse = "") header = '>transcript1' fastaFile = tempfile() writeLines(paste(header,fasta,sep = "\n"),con = fastaFile) rnaRefs = list() rnaRefs[[rnas(cds)]] = read.fasta(fastaFile) rnaRefs foldedCds = foldrnaCrosslink(trimmedClusters, rnaRefs = rnaRefs, start = 1, end = 83, shape = 0, ensembl = 5, constraintNumber = 1, evCutoff = 1) plotEnsemblePCA(foldedCds) ## End(Not run)
Plots a contact map of interactions of each sample of an RNA (interactor) on the RNA of interest
plotInteractions( cds, rna, interactor, directory = 0, a = 1, b = 50, c = 1, d = 50, h = 3 )
plotInteractions( cds, rna, interactor, directory = 0, a = 1, b = 50, c = 1, d = 50, h = 3 )
cds |
A rnaCrosslinkDataSet object |
rna |
The RNA of interest |
interactor |
The RNA to show interactions with |
directory |
An output directory for the heatmap (use 0 for no output) |
a |
To make a subsetted plot (left value on x) |
b |
To make a subsetted plot (right value on x) (use 'max' to plot the whole RNA strand length) |
c |
To make a subsetted plot (left value on y) |
d |
To make a subsetted plot (right value on y) (use 'max' to plot the whole RNA strand length) |
h |
Height of image (inches) (only useful if plotting) |
A heatmap of interactions of the RNA (interactor) on the RNA of interest
cds = makeExamplernaCrosslinkDataSet() plotInteractions(cds, rna = "transcript1", interactor = "transcript2", b = "max", d = "max")
cds = makeExamplernaCrosslinkDataSet() plotInteractions(cds, rna = "transcript1", interactor = "transcript2", b = "max", d = "max")
Plots a contact map of interactions of all samples of an RNA (interactor) on the RNA of interest
plotInteractionsAverage( cds, rna, interactor, directory = 0, a = 1, b = 50, c = 1, d = 50, h = 3 )
plotInteractionsAverage( cds, rna, interactor, directory = 0, a = 1, b = 50, c = 1, d = 50, h = 3 )
cds |
A rnaCrosslinkDataSet object |
rna |
The RNA of interest |
interactor |
The RNA to show interactions with |
directory |
An output directory for the heatmap (use 0 for no output) |
a |
To make a subsetted plot (left value on x) |
b |
To make a subsetted plot (right value on x) (use 'max' to plot the whole RNA strand length) |
c |
To make a subsetted plot (left value on y) |
d |
To make a subsetted plot (right value on y) (use 'max' to plot the whole RNA strand length) |
h |
Height of image (inches) (only useful if plotting) |
A heatmap of interactions of all samples of the RNA (interactor) on the RNA of interest
cds = makeExamplernaCrosslinkDataSet() plotInteractionsAverage(cds, rna = "transcript1", interactor = "transcript2", b = "max", d = "max")
cds = makeExamplernaCrosslinkDataSet() plotInteractionsAverage(cds, rna = "transcript1", interactor = "transcript2", b = "max", d = "max")
Plots a number of contact maps to file of each sample for a stage in the analysis
plotMatrices( cds, type = "original", directory = 0, a = 1, b = 50, c = 1, d = 50, h = 3 )
plotMatrices( cds, type = "original", directory = 0, a = 1, b = 50, c = 1, d = 50, h = 3 )
cds |
A rnaCrosslinkDataSet object |
type |
The analysis stage to plot |
directory |
An output directory for the heatmap (use 0 for no output) |
a |
To make a subsetted plot (left value on x) |
b |
To make a subsetted plot (right value on x) |
c |
To make a subsetted plot (left value on y) |
d |
To make a subsetted plot (right value on y) |
h |
Height of image (inches) ( only useful if plotting) |
A heatmap of the reads in the analysis stage chosen
cds = makeExamplernaCrosslinkDataSet() plotMatrices(cds, b = rnaSize(cds), d = rnaSize(cds))
cds = makeExamplernaCrosslinkDataSet() plotMatrices(cds, b = rnaSize(cds), d = rnaSize(cds))
Plots a contact map of all samples for two chosen stages in the analysis, with each chosen stage on separate halves of the contact map
plotMatricesAverage( cds, type1 = "original", type2 = "blank", directory = 0, a = 1, b = 50, c = 1, d = 50, h = 3 )
plotMatricesAverage( cds, type1 = "original", type2 = "blank", directory = 0, a = 1, b = 50, c = 1, d = 50, h = 3 )
cds |
A rnaCrosslinkDataSet object |
type1 |
The analysis stage to plot on the upper half of the heatmap (use 'blank' to leave this half blank) |
type2 |
The analysis stage to plot on the lower half of the heatmap (use 'blank' to leave this half blank) |
directory |
An output directory for the heatmap (use 0 for no output) |
a |
To make a subsetted plot (left value on x) |
b |
To make a subsetted plot (right value on x) |
c |
To make a subsetted plot (left value on y) |
d |
To make a subsetted plot (right value on y) |
h |
Height of image (inches) ( only useful if plotting) |
A heatmap of the reads in the two analysis stages chosen, with each chosen stage on a separate half of the heatmap
cds = makeExamplernaCrosslinkDataSet() plotMatricesAverage(cds, b = rnaSize(cds), d = rnaSize(cds))
cds = makeExamplernaCrosslinkDataSet() plotMatricesAverage(cds, b = rnaSize(cds), d = rnaSize(cds))
This method plots a structures chosen from the plotEnsemblePCA method
plotStructure(foldedCds, rnaRefs, s = "s1", n = 1)
plotStructure(foldedCds, rnaRefs, s = "s1", n = 1)
foldedCds |
|
rnaRefs |
A fasta of the transcript (made with seqinr::read.fasta) |
s |
sample of structure |
n |
number of structure |
a diagram of the predicted structure
## Not run: cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds = cds, cores = 3, stepCount = 2, clusterCutoff = 1) trimmedClusters = trimClusters(clusteredCds = clusteredCds,trimFactor = 1, clusterCutoff = 1) fasta = paste(c(rep('A',25), rep('T',25), rep('A',10), rep('T',23)),collapse = "") header = '>transcript1' fastaFile = tempfile() writeLines(paste(header,fasta,sep = "\n"),con = fastaFile) rnaRefs = list() rnaRefs[[rnas(cds)]] = read.fasta(fastaFile) rnaRefs foldedCds = foldrnaCrosslink(trimmedClusters, rnaRefs = rnaRefs, start = 1, end = 83, shape = 0, ensembl = 5, constraintNumber = 1, evCutoff = 1) plotStructure(foldedCds,rnaRefs,"s1",3) ## End(Not run)
## Not run: cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds = cds, cores = 3, stepCount = 2, clusterCutoff = 1) trimmedClusters = trimClusters(clusteredCds = clusteredCds,trimFactor = 1, clusterCutoff = 1) fasta = paste(c(rep('A',25), rep('T',25), rep('A',10), rep('T',23)),collapse = "") header = '>transcript1' fastaFile = tempfile() writeLines(paste(header,fasta,sep = "\n"),con = fastaFile) rnaRefs = list() rnaRefs[[rnas(cds)]] = read.fasta(fastaFile) rnaRefs foldedCds = foldrnaCrosslink(trimmedClusters, rnaRefs = rnaRefs, start = 1, end = 83, shape = 0, ensembl = 5, constraintNumber = 1, evCutoff = 1) plotStructure(foldedCds,rnaRefs,"s1",3) ## End(Not run)
Makes a table with the coordinates of the clusters
printClustersFast(dir, clustering, highest_clusters, left, right)
printClustersFast(dir, clustering, highest_clusters, left, right)
dir |
the directory that contains the *Inputrids.Input files |
clustering |
The output from the iGraph function cluster_walktrap for the (made with adjacency matrix input) |
highest_clusters |
The cluster you are interested in keeping |
left |
list created with InputToGRanges (but just the left section of the list) |
right |
list created with InputToGRanges (but just the right section of the list) |
Does the same as printClusters but is a lot faster and does not create plots of each cluster
A table of clusters and coordinates
This method prints a table showing the number of duplexes in the clusters in each step of the analysis
readNumbers(knowClusteredCds, rna)
readNumbers(knowClusteredCds, rna)
knowClusteredCds |
A rnaCrosslinkDataSet object after clustering has been performed |
rna |
The RNA ID of interest - use rna(cdsObject). |
A data.frame shoing the number of reads in clusters for each sample
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 1) readNumbers(clusteredCds)
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 1) readNumbers(clusteredCds)
rnaCrosslinkDataSet
objects are used to store the input meta-data, data and
create a framework for the storage of results. Whilst creating the object,
the original Input files are also filtered for the RNA of interest. Check the
package vignette for more information.
rnaCrosslinkDataSet( rnas, rnaSize = 0, sampleTable, subset = "all", sample = "all" )
rnaCrosslinkDataSet( rnas, rnaSize = 0, sampleTable, subset = "all", sample = "all" )
rnas |
vector - The names of the RNA interest, these must be displayed the same way as in the input Input Files. |
rnaSize |
named list - The sizes (nt) of the RNAs of interest, the list
elements must have same names as the |
sampleTable |
string - The address of the sample table, the sample table must have 4 columns, fileName (the full path and file name of the input Input file for each sample ), group ("s" - sample or "c" - control), sample (1,2,3, etc), sampleName (must be unique). |
subset |
a vector of 4 values to subset based on structural read size. c(l-min,l-max,r-min,r-max) |
sample |
The number of reads to sample for each sample. |
A rnaCrosslinkDataSet object.
clusterTableFolded
table - a table similar to the clusterTableList
it contains coordinates of the clusters along with vienna format fold and
RNA sequences for each cluster
clusterTableList
List - Follows the pattern for list slots of rnaCrosslinkDataSet
objects, matrixList(cds)[[rna]][[type]][[sample]]
. contains a table
with coordinates and information about the clusters identified
clusterGrangesList
List - Follows the pattern for list slots of rnaCrosslinkDataSet
objects, matrixList(cds)[[rna]][[type]][[sample]]
. contains GRanges
objects of the original duplexes with their cluster membership
sampleTable
table - Column names; fileName, group (s or c), sample (1,2,3, etc), sampleName (must be unique)
rnas
string - a single RNA to analyse - must be present in rnas(cdsObject)
rnaSize
if set to 0 this will be calculated
matrixList
List - Follows the pattern for list slots of rnaCrosslinkDataSet
objects, matrixList(cds)[[rna]][[type]][[sample]]
. Contains a set
of contact matrices, each cell contains the number of duplexes identified
for position x,y.
InputFiles
List - Follows the pattern for list slots of rnaCrosslinkDataSet
objects, InputFiles(cds)[[rna]][[type]][[sample]]
. Contains a set of
tables, these are the original Input files that were read in.
interactionTable
Table of interactions discovered in step1 of the folding
viennaStructures
List of vienna format structures from final prediction
dgs
List of free energies
# make example input cds = makeExamplernaCrosslinkDataSet() cds
# make example input cds = makeExamplernaCrosslinkDataSet() cds
get a plot fo the read lengths and transcripts in the dataset The fucntion will make 1 pdf and 2 text file in the directory provided
rnaCrosslinkQC(sampleTable, directory, topTranscripts = TRUE)
rnaCrosslinkQC(sampleTable, directory, topTranscripts = TRUE)
sampleTable |
string - The address of the sample table, the sample table must have 4 columns, fileName (the full path and file name of the input Input file for each sample ), group ("s" - sample or "c" - control), sample (1,2,3, etc), sampleName (must be unique). |
directory |
A directory address to write the files |
topTranscripts |
If FALSE a table of top trandscirpts will not be written to file |
ggplot and txt file
c4 = c(rep("transcript1",100),rep("transcript2",100) ) c10 = c(rep("transcript1",200) ) c1 = 1:200 c2 = rep(paste(rep("A", 40), collapse = ""),200) c3 = rep(".",200) c9 = rep(".",200) c15 = rep(".",200) c5 = rep(1,200) c11 = rep(21,200) c6 = rep(20,200) c12= rep(40,200) # short distance 50 c7 = sample(1:5, 50, replace = TRUE) c8 = sample(20:25, 50, replace = TRUE) c13 = sample(20:25, 50, replace = TRUE) c14 = sample(40:45, 50, replace = TRUE) # long distance 50 c7 = c(c7,sample(1:5, 50, replace = TRUE)) c8 = c(c8,sample(20:25, 50, replace = TRUE)) c13 = c(c13,sample(60:70, 50, replace = TRUE)) c14 = c(c14,sample(80:83, 50, replace = TRUE)) # inter RNA 100 c7 = c(c7,sample(1:5, 100, replace = TRUE)) c8 = c(c8,sample(20:25, 100, replace = TRUE)) c13 = c(c13,sample(1:5, 100, replace = TRUE)) c14 = c(c14,sample(20:25, 100, replace = TRUE)) exampleInput = data.frame(V1 = c1, V2 = c2, V3 = c3, V4 = c4, V5 = as.numeric(c5), V6 = as.numeric(c6), V7 = as.numeric(c7), V8 = as.numeric(c8), V9 = c9, V10 = c10, V11 = as.numeric(c11), V12 = as.numeric(c12), V13 = as.numeric(c13), V14 = as.numeric(c14), V15 = c15) file = tempfile() write.table(exampleInput, file = file, quote = FALSE, row.names = FALSE, sep = "\t", col.names = FALSE) c4 = c(rep("transcript1",55),rep("transcript2",90) ) c10 = c(rep("transcript1",145) ) c1 = 1:145 c2 = rep(paste(rep("A", 40), collapse = ""),145) c3 = rep(".",145) c9 = rep(".",145) c15 = rep(".",145) c5 = rep(1,145) c11 = rep(21,145) c6 = rep(20,145) c12= rep(40,145) # short distance 55 c7 = sample(1:5, 55, replace = TRUE) c8 = sample(20:25, 55, replace = TRUE) c13 = sample(20:25, 55, replace = TRUE) c14 = sample(40:45, 55, replace = TRUE) # inter RNA 100 c7 = c(c7,sample(1:40, 90, replace = TRUE)) c8 = c(c8,sample(20:75, 90, replace = TRUE)) c13 = c(c13,sample(1:40, 90, replace = TRUE)) c14 = c(c14,sample(20:75, 90, replace = TRUE)) exampleInput = data.frame(V1 = c1, V2 = c2, V3 = c3, V4 = c4, V5 = as.numeric(c5), V6 = as.numeric(c6), V7 = as.numeric(c7), V8 = as.numeric(c8), V9 = c9, V10 = c10, V11 = as.numeric(c11), V12 = as.numeric(c12), V13 = as.numeric(c13), V14 = as.numeric(c14), V15 = c15) file2 = tempfile() write.table(exampleInput, file = file2, quote = FALSE, row.names = FALSE, sep = "\t", col.names = FALSE) # Set up the sample table. ---- sampleTabler1 = c(file, "s", "1", "s1") sampleTabler2 = c(file2, "c", "1", "c1") # make the sample table sampleTable2 = rbind.data.frame(sampleTabler1, sampleTabler2) # add the column names colnames(sampleTable2) = c("file", "group", "sample", "sampleName") rnaCrosslinkQC(sampleTable2,tempdir())
c4 = c(rep("transcript1",100),rep("transcript2",100) ) c10 = c(rep("transcript1",200) ) c1 = 1:200 c2 = rep(paste(rep("A", 40), collapse = ""),200) c3 = rep(".",200) c9 = rep(".",200) c15 = rep(".",200) c5 = rep(1,200) c11 = rep(21,200) c6 = rep(20,200) c12= rep(40,200) # short distance 50 c7 = sample(1:5, 50, replace = TRUE) c8 = sample(20:25, 50, replace = TRUE) c13 = sample(20:25, 50, replace = TRUE) c14 = sample(40:45, 50, replace = TRUE) # long distance 50 c7 = c(c7,sample(1:5, 50, replace = TRUE)) c8 = c(c8,sample(20:25, 50, replace = TRUE)) c13 = c(c13,sample(60:70, 50, replace = TRUE)) c14 = c(c14,sample(80:83, 50, replace = TRUE)) # inter RNA 100 c7 = c(c7,sample(1:5, 100, replace = TRUE)) c8 = c(c8,sample(20:25, 100, replace = TRUE)) c13 = c(c13,sample(1:5, 100, replace = TRUE)) c14 = c(c14,sample(20:25, 100, replace = TRUE)) exampleInput = data.frame(V1 = c1, V2 = c2, V3 = c3, V4 = c4, V5 = as.numeric(c5), V6 = as.numeric(c6), V7 = as.numeric(c7), V8 = as.numeric(c8), V9 = c9, V10 = c10, V11 = as.numeric(c11), V12 = as.numeric(c12), V13 = as.numeric(c13), V14 = as.numeric(c14), V15 = c15) file = tempfile() write.table(exampleInput, file = file, quote = FALSE, row.names = FALSE, sep = "\t", col.names = FALSE) c4 = c(rep("transcript1",55),rep("transcript2",90) ) c10 = c(rep("transcript1",145) ) c1 = 1:145 c2 = rep(paste(rep("A", 40), collapse = ""),145) c3 = rep(".",145) c9 = rep(".",145) c15 = rep(".",145) c5 = rep(1,145) c11 = rep(21,145) c6 = rep(20,145) c12= rep(40,145) # short distance 55 c7 = sample(1:5, 55, replace = TRUE) c8 = sample(20:25, 55, replace = TRUE) c13 = sample(20:25, 55, replace = TRUE) c14 = sample(40:45, 55, replace = TRUE) # inter RNA 100 c7 = c(c7,sample(1:40, 90, replace = TRUE)) c8 = c(c8,sample(20:75, 90, replace = TRUE)) c13 = c(c13,sample(1:40, 90, replace = TRUE)) c14 = c(c14,sample(20:75, 90, replace = TRUE)) exampleInput = data.frame(V1 = c1, V2 = c2, V3 = c3, V4 = c4, V5 = as.numeric(c5), V6 = as.numeric(c6), V7 = as.numeric(c7), V8 = as.numeric(c8), V9 = c9, V10 = c10, V11 = as.numeric(c11), V12 = as.numeric(c12), V13 = as.numeric(c13), V14 = as.numeric(c14), V15 = c15) file2 = tempfile() write.table(exampleInput, file = file2, quote = FALSE, row.names = FALSE, sep = "\t", col.names = FALSE) # Set up the sample table. ---- sampleTabler1 = c(file, "s", "1", "s1") sampleTabler2 = c(file2, "c", "1", "c1") # make the sample table sampleTable2 = rbind.data.frame(sampleTabler1, sampleTabler2) # add the column names colnames(sampleTable2) = c("file", "group", "sample", "sampleName") rnaCrosslinkQC(sampleTable2,tempdir())
Extract the rna ID for the instance
rnas(x)
rnas(x)
x |
A rnaCrosslinkDataSet object |
A character - the ID of the RNA
cds = makeExamplernaCrosslinkDataSet() rnas(cds)
cds = makeExamplernaCrosslinkDataSet() rnas(cds)
Extract the size of the RNA for the instance
rnaSize(x)
rnaSize(x)
x |
A rnaCrosslinkDataSet object |
A numeric - the size of the RNA (nucleotides)
cds = makeExamplernaCrosslinkDataSet() rnaSize(cds)
cds = makeExamplernaCrosslinkDataSet() rnaSize(cds)
This function samples chimeras into smaller chunks so that clustering is quicker
sampleChimeras(chimeraList)
sampleChimeras(chimeraList)
chimeraList |
list of chimeras |
Extract the sample names for the instance
sampleNames(x)
sampleNames(x)
x |
A rnaCrosslinkDataSet object |
A character vector - the sample names
cds = makeExamplernaCrosslinkDataSet() sampleNames(cds)
cds = makeExamplernaCrosslinkDataSet() sampleNames(cds)
Extract the sample table for the instance
sampleTable(x)
sampleTable(x)
x |
A rnaCrosslinkDataSet object |
A data frame - The orginal meta-data table
cds = makeExamplernaCrosslinkDataSet() sampleTable(cds)
cds = makeExamplernaCrosslinkDataSet() sampleTable(cds)
Subset a list of Input files
subsetInputList2(InputList, min, max, length)
subsetInputList2(InputList, min, max, length)
InputList |
the original InputList created with readInputFiles |
min |
the rna of interest that you want to subset |
max |
The number of randomly subsetted chimeric reads you need |
length |
The number of randomly subsetted chimeric reads you need |
Function used to subset a list of Input data created by readInputFiles This function produces the same size list as before but it returns ONLY the rna of interest and also Choose duplexes where the nt difference in position between the one side and other side of an interaction is between min and max
A list of subsetted Input files
Swap the table to ensure that 3 prime most duplex side is on he left of the table used to make one sides heatmaps and other reasons where having the left of the table coming after the right side is a problem. Different from swapInputs as it ensure that BOTH duplex sides originate from the RNA of interest.
swapInputs(InputList, rna)
swapInputs(InputList, rna)
InputList |
the original InputList created with readInputFiles or subsetInputList |
rna |
The rna of interest |
A list of "swapped" Input datas
Swap the table to ensure that 3 prime most duplex side is on the left of the table used to make one sides heatmaps and other reasons where having the left of the table coming after the right side is a problem. Different from swapInputs as it ensure that BOTH duplex sides originate from the RNA of interest.
swapInputs2(InputList, rna)
swapInputs2(InputList, rna)
InputList |
the original InputList created with readInputFiles or subsetInputList |
rna |
The rna of interest |
A list of "swapped" Input data
Swap the table to ensure that 3 prime most duplex side is ont he left of the table used to make one sides heatmaps and other reasons where having the left of the table coming after the right side is a problem. Different from swapInputs as it ensure that BOTH duplex sides originate from the RNA of interest.
swapInputs3(InputList, rna)
swapInputs3(InputList, rna)
InputList |
the original InputList created with readInputFiles or subsetInputList |
rna |
The rna of interest |
A list of "swapped" Input datas
This method prints the top transcripts that have the most duplexes assigned that interact with the transcript of interest
topInteracters(cds, ntop = 10, sds = TRUE)
topInteracters(cds, ntop = 10, sds = TRUE)
cds |
a |
ntop |
the number of entries to display |
sds |
known bug, doesn't work for small data sets fix incoming |
A table, the number of counts per sample per interacting transcript
cds = makeExamplernaCrosslinkDataSet() topInteracters(cds, sds = TRUE)
cds = makeExamplernaCrosslinkDataSet() topInteracters(cds, sds = TRUE)
This method prints the top transcript interactions that have the most duplexes assigned
topInteractions(cds, ntop = 10)
topInteractions(cds, ntop = 10)
cds |
a |
ntop |
the number of entries to display |
A table, the number of counts per sample per interaction
cds = makeExamplernaCrosslinkDataSet() topInteractions(cds)
cds = makeExamplernaCrosslinkDataSet() topInteractions(cds)
This method prints the top transcripts that have the most duplexes assigned
topTranscripts(cds, ntop = 10)
topTranscripts(cds, ntop = 10)
cds |
a |
ntop |
the number of entries to display |
A table, the number of counts per sample per transcript
cds = makeExamplernaCrosslinkDataSet() topTranscripts(cds)
cds = makeExamplernaCrosslinkDataSet() topTranscripts(cds)
Trimming of the clusters removes redundant information derived from random
fragmentation of the reads during library preparation. This method takes
a rnaCrosslinkDataSet
object where clustering has been performed with
the clusterrnaCrosslink method and trims the clusters according to the
trimFactor argument.
trimClusters(clusteredCds, trimFactor = 2.5, clusterCutoff = 1)
trimClusters(clusteredCds, trimFactor = 2.5, clusterCutoff = 1)
clusteredCds |
a |
trimFactor |
a positive value that defines how much the clusters will |
clusterCutoff |
Minimum number of reads before discarding cluster be trimmed = mean + ( sd * trimFactor ) |
The 3 attributes; matrixList, clusterTableList and clusterGrangesList
will gain the types
"superClusters" and "trimmedClusters"
Returns a rnaCrosslinkDataSet
object
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0) trimClusters(clusteredCds = clusteredCds, trimFactor = 1, clusterCutoff = 0)
cds = makeExamplernaCrosslinkDataSet() clusteredCds = clusterrnaCrosslink(cds, cores = 1, stepCount = 1, clusterCutoff = 0) trimClusters(clusteredCds = clusteredCds, trimFactor = 1, clusterCutoff = 0)