maccu.ExpressionReader
This class defines a main method. When called, it reads a probe expression file, which should contain RMA-normalized array data, computes Pearson correlations between every pair of probes, and stores the correlation values that pass a specified threshold. The stored correlation values can be reused later to save computation time.
This class also defines two constructors: one takes a single parameter, the probe expression file, and the other takes a probe expression file and a file storing precomputed correlation values. If the object is created with the first constructor, the public method getCorrel computes Pearson correlations itself; otherwise, getCorrel looks up the precomputed values.
Parameters
- -I: the probe expression file, which is tab-delimited, as output by the RMA package of R
- -O: output filename, stores correlation values
- -NF: number filter NF, a correlation value is computed only if both probes are present in at least NF slides
- -CF: correlation filter CF, a correlation value is stored if it is greater than or equal to CF
- -assign: boolean, default FALSE, indicating whether the probe expression file contains naming information
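The way the -NF and -CF filters interact can be sketched as follows. This is only an illustration in Python, not the code of ExpressionReader: the function names are hypothetical, and it assumes that a probe absent from a slide is represented by None.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if it is undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant profile has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def filtered_correlations(expr, nf, cf):
    """expr maps a probe name to its per-slide values (None = absent, an assumption).

    A pair is considered only if both probes are present in at least nf slides,
    and its correlation is kept only if it is >= cf.
    """
    kept = {}
    for a, b in combinations(sorted(expr), 2):
        shared = [(x, y) for x, y in zip(expr[a], expr[b])
                  if x is not None and y is not None]
        if len(shared) < nf or len(shared) < 2:
            continue  # number filter (at least two points are always needed)
        r = pearson([x for x, _ in shared], [y for _, y in shared])
        if r is not None and r >= cf:  # correlation filter
            kept[(a, b)] = r
    return kept
```

With nf=3 and cf=0.9, a probe pair that shares only two slides is never evaluated, and a pair with a strong negative correlation is evaluated but not stored.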
maccu.RelationComputer
This class reads expression data using ExpressionReader and computes relationships between every pair of genes in the given gene list, which may consist of loci or probe names. It then produces a .graph file and a .dot file for further processing. The .graph file describes a graph whose nodes are loci.
Parameters
- -I: the probe expression file, which contains tab-delimited expression levels (e.g., as output by the RMA package of R)
- -assign: name assignment of the probe expression file. See also Name assignment of probes.
- -C: the file storing precomputed correlation values, optional
- -O: prefix of output filenames
- -P: the gene list, either a list of loci or a list of probe names
- -L: boolean, defines whether the gene list (given by -P) consists of loci (TRUE) or probe names (FALSE). Default: TRUE.
- -NF: number filter NF, a correlation value is computed only if both probes are present in at least NF slides. This parameter has no effect if -C is specified.
- -CF: correlation filter CF, a correlation value is stored if it is greater than or equal to CF
Filtering / mapping
Some probe names (or locus names) in the gene list file (specified by the -P option) may be filtered out. A name is filtered out if it does not appear in the name assignment (specified by the -assign option). For microarray experiments, this is usually caused by differences in probe annotation.
Mapped probe names (or locus names) will be saved in a .mapped file.
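The filtering rule amounts to a membership test against the name assignment. A minimal Python sketch, assuming exact, case-sensitive name matching (the function name is hypothetical, and the actual layout of the .mapped file is not shown here):

```python
def split_by_assignment(gene_list, assigned_names):
    """Split gene_list into names found in the name assignment and names filtered out."""
    assigned = set(assigned_names)
    mapped = [name for name in gene_list if name in assigned]
    filtered = [name for name in gene_list if name not in assigned]
    return mapped, filtered
```

Comparing the filtered names against the original gene list is a quick way to see which entries were lost to annotation differences.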
maccu.GraphAdjust
This class reads a .graph file (in most cases produced by RelationComputer) and performs the specified graph-level operations. It then writes the resulting graph to a <prefix>-all.graph file and each connected component to a <prefix>-X.graph file, where <prefix> is defined by the -O option and X is the serial number of the connected component. This class also outputs a simple .tree file for the GOBU program; annotations can be attached to this tree using the AddAnnotation class.
Parameters
- -I: the input .graph file
- -O: prefix of output filenames
- -G: generate all <prefix>-X.graph files. Default: FALSE.
- -RN <x>: remove node <x>
- -RE <x> <y>: remove the edge between <x> and <y>
- -ND: node decomposition, split a node that represents multiple loci into multiple nodes. Default: FALSE.
- -remove: the graph (in .graph file) for the remove operation
- -retain: the graph (in .graph file) for the retain operation
- -D <x>: degree filter, nodes with less than or equal to <x> neighbors will be removed.
- -S <x>: cluster-size filter, clusters of sizes less than or equal to <x> will be removed.
Options -RN, -RE, -remove and -retain can each be given multiple times.
Operation order
This program modifies the input graph in the following order: (1) removing nodes (-RN), (2) removing edges (-RE), (3) node decomposition (-ND), (4) removing graphs (-remove), (5) retaining graphs (-retain), (6) degree filtering (-D), and (7) cluster-size filtering (-S).
This class reads a .graph-format file, applies some cosmetic adjustments, and outputs a .dot file for the Graphviz program. It can color nodes according to (1) intensity fold-changes or (2) a specified color list. It can also give specified labels to nodes.
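The last two steps, degree filtering followed by cluster-size filtering, can be sketched in Python as below. This is an illustration rather than the GraphAdjust code: the graph is assumed to be a dict mapping each node to the set of its neighbors, the function names are hypothetical, and the degree filter is assumed to run as a single pass over the degrees of the input graph.

```python
def degree_filter(adj, x):
    """Remove nodes with <= x neighbors (single pass; an assumption)."""
    keep = {node for node, nbrs in adj.items() if len(nbrs) > x}
    return {node: adj[node] & keep for node in keep}


def connected_components(adj):
    """Return the connected components of an undirected graph as a list of sets."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = set()
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        components.append(component)
    return components


def cluster_size_filter(adj, x):
    """Remove every connected component with <= x nodes."""
    keep = set()
    for component in connected_components(adj):
        if len(component) > x:
            keep |= component
    return {node: adj[node] & keep for node in keep}
```

Because the degree filter runs first, a cluster can shrink below the size threshold before the cluster-size filter sees it, so the order of the two steps affects the result.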
Parameters
- -I: the input .graph file, whose nodes are loci
- -O: the output .dot file
- -label: the label file. In this file, each line gives the label (second token, trimmed) of the locus (first token). See also Labels of loci.
- -fold: the file defining intensity fold-changes, from -4 (green, most down-regulated) through 0 (grey) to 4 (red, most up-regulated). See also Fold-change assignment.
- -C <X> <Y>: assign color <X> to the nodes in .graph-format file <Y>. This parameter can be applied multiple times. Note that an earlier color assignment takes priority over a later one. Also note that this parameter has no effect if -fold is used.
- -font1: font size of labels, which is usually larger than -font2
- -font2: specified font size of loci.
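One way to picture the -fold color scale is a linear blend from grey toward green or red. The Python sketch below is an assumption for illustration only: the exact RGB values, the linear interpolation, and the clamping of values outside [-4, 4] are not taken from the program.

```python
GREEN = (0, 255, 0)      # assumed shade for fold-change -4
GREY = (128, 128, 128)   # assumed shade for fold-change 0
RED = (255, 0, 0)        # assumed shade for fold-change 4


def fold_to_hex(fold, limit=4.0):
    """Map a fold-change to a '#rrggbb' color by blending grey toward green or red."""
    fold = max(-limit, min(limit, fold))  # clamp to [-limit, limit]
    weight = abs(fold) / limit
    target = RED if fold > 0 else GREEN
    channels = [round(g + weight * (t - g)) for g, t in zip(GREY, target)]
    return "#{:02x}{:02x}{:02x}".format(*channels)
```

Graphviz accepts colors in this "#rrggbb" form, so a value produced this way could be used as a node fill color in a .dot file.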
Possible problem for the -label option
The naming file specified by the -label option gives a name for each locus. If a node shows only locus accessions, or shows the name "null", the corresponding loci are not defined in the naming file.
Possible problem for the -fold option
The fold-change file specified by the -fold option gives fold-changes of probes, where a probe name is either a single locus accession or a combination of locus names, such as "AT5G04860;AT2G10560". If a node is not colored when the -fold option is used, either (1) the node is not included in the fold-change file, or (2) the node name in the fold-change file is not consistent with the one in the name assignment of probes, e.g. "AT5G04860;AT2G10560" in one file and "AT2G10560;AT5G04860" in the other. (fixed after version 0.55)
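When checking input files for the second cause, composite names can be compared in an order-independent way. A minimal Python sketch, assuming that ";" is the only separator (the function names are hypothetical):

```python
def canonical_name(name):
    """Sort the ';'-separated locus names so that their order no longer matters."""
    parts = [part.strip() for part in name.split(";") if part.strip()]
    return ";".join(sorted(parts))


def uncolored_nodes(node_names, fold_names):
    """Return the node names that have no fold-change entry even after canonicalization."""
    known = {canonical_name(name) for name in fold_names}
    return [name for name in node_names if canonical_name(name) not in known]
```

Names still reported by uncolored_nodes after this normalization are missing from the fold-change file altogether, which corresponds to cause (1).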
Color suggestion
If the -C option is applied, the following colors are recommended:
- pink
- cyan
- yellowgreen
- yellow
- red
- chocolate
- green
- purple
For other color names, please refer to the X11 color scheme.
twopi -Goutputorder=edgesfirst -Goverlap=vpsc -Granksep=2 -Gratio=auto -Tpng -oOutput.png Input.dot
twopi is better for drawing graphs with many nodes, but some edges may be hidden.
neato -Goverlap=false -Gsplines=true -Gsep=.1 -Tpng -oOutput.png Input.dot
neato is better for drawing graphs with fewer nodes, but takes more time to compute.
maccu.CoExpressFishing
Given a bait set, we find edges connecting bait genes and non-bait genes. We call the non-bait genes found this way fished genes. By adding the fished genes to the bait gene list, we can repeat the fishing again and again, until no new genes are fished.
Parameters
- -I: the probe expression file
- -assign: name assignment of the probe expression file. See also Name assignment of probes.
- -C: the file storing precomputed correlation values. Note that the fishing operation is much faster if this parameter is given.
- -O: prefix of output filenames
- -B: the bait file, which defines the bait set. Note that a .graph file can also be given for this option.
- -STEP: the number of fishing operations (as described above) to perform
- -BUFFER: defines whether or not to search for correlation edges between fished nodes
- -NF: number filter NF, a correlation value is computed only if both probes are present in at least NF slides. This parameter has no effect if -C is specified.
- -CF: correlation filter CF, a correlation value is stored if it is above CF
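The repeated fishing described above can be sketched in Python as follows. This is an illustration under assumptions, not the CoExpressFishing code: the correlation edges that pass the filters are assumed to be available as a dict mapping each gene to the set of its neighbors, max_steps is a hypothetical stand-in for -STEP, and -BUFFER is not modelled.

```python
def fish(adj, bait, max_steps=None):
    """Extend the bait set round by round with newly fished neighbors.

    adj maps a gene to the set of genes it shares a correlation edge with.
    Stops when a round fishes nothing new or after max_steps rounds.
    Returns the extended gene set and the list of genes fished in each round.
    """
    current = set(bait)
    rounds = []
    step = 0
    while max_steps is None or step < max_steps:
        fished = set()
        for gene in current:
            fished |= adj.get(gene, set()) - current
        if not fished:
            break  # no newly fished genes: the fishing has converged
        rounds.append(fished)
        current |= fished
        step += 1
    return current, rounds
```

Keeping the genes of each round separate makes it possible to tell how many fishing steps away from the original bait a gene was found.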
Filtering / mapping
Some locus names in the bait file (specified by the -B option) may be filtered out. A name is filtered out if it does not appear in the name assignment (specified by the -assign option). For microarray experiments, this is usually caused by differences in probe annotation.
Mapped locus names will be saved in a .mapped file.
util.Utility008
Translates a .graph file into a PowerPoint file. See here for a detailed description.