Tuesday, December 4, 2012

Archaeal COG (arCOG) Sequence Statistics

Sequence length histograms of unique sequences compared with well characterized (wc) COG data.

Note: for wcCOG,

  • Maximum sequence length: 5627
  • Minimum sequence length: 18
  • Mean sequence length: 888
  • Median sequence length: 802
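As a rough illustration of how summary values like these can be obtained, here is a minimal sketch. It assumes the sequences are already loaded as plain strings and that "unique" means exact duplicates are dropped first; the function name `length_stats` and the toy input are hypothetical, not part of the original analysis.

```python
from statistics import mean, median

def length_stats(sequences):
    """Return max, min, mean and median length over the unique sequences."""
    # Assumption: uniqueness means exact string equality.
    unique = set(sequences)
    lengths = [len(s) for s in unique]
    return {
        "max": max(lengths),
        "min": min(lengths),
        "mean": mean(lengths),
        "median": median(lengths),
    }

# Toy input only; the real data would come from the COG sequence files.
stats = length_stats(["MKV", "MKVLA", "MKV", "MKVLAAGTW"])
print(stats)
```

The same list of lengths can be passed to any histogram routine to reproduce plots of this kind.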






Sequence length histograms of unique sequences compared with well characterized (wc) COG data for sequences with length <= 1000.


Wednesday, September 12, 2012


COG 183362+4872 NW PID Log(1-d^6) with Sammon


Description

DataSet: COG Size: 183362+4872 Unique: Yes* (5 of the 4872 overlap with the 183362)
Aligner: NeedlemanWunsch ScoringMatrix: BLOSUM62 GapOpen: -9 GapExt: -1
DistanceType: (1 - PercentIdentity) Transformation: TM12,TP6
Mapping: Sammon DistanceCut: None
Initialization: Random
Fixed: None
Varied: All
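
For readers unfamiliar with the Mapping setting: Sammon mapping looks for low-dimensional points whose pairwise Euclidean distances minimize the Sammon stress against the original distances. The sketch below only evaluates that stress for a given embedding; it is not the optimizer used for these runs, which the entry does not describe. The function name and the dictionary layout for the original distances are assumptions made for illustration.

```python
import math

def sammon_stress(original, points):
    """Sammon stress of an embedding.

    original: dict mapping (i, j) with i < j to the original distance (> 0)
    points:   list of coordinate tuples, one per item
    """
    total = sum(original.values())
    err = 0.0
    for (i, j), d_orig in original.items():
        d_map = math.dist(points[i], points[j])  # Euclidean distance in the map
        # Each squared error is weighted by 1 / original distance,
        # so small original distances are preserved more strictly.
        err += (d_orig - d_map) ** 2 / d_orig
    return err / total

# Toy example: three points whose original distances are matched exactly.
original = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): math.sqrt(2)}
exact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
doubled = [(2 * x, 2 * y, 2 * z) for x, y, z in exact]
print(sammon_stress(original, exact), sammon_stress(original, doubled))
```

A stress of zero means every original distance is reproduced exactly in the map; larger values indicate more distortion.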

Links

Images


Full Sample with Selected Clusters


Full Sample with Selected Clusters Zoomed-in


Full Sample with Selected Clusters and Consensus

Tuesday, May 29, 2012

COG 95672 NW PID Log(1-d^4) with Sammon


Description

DataSet: COG Size: 95672 Unique: Yes
Aligner: NeedlemanWunsch ScoringMatrix: BLOSUM62 GapOpen: -16 GapExt: -4
DistanceType: (1 - PercentIdentity) Transformation: TM12,TP4
Mapping: Sammon DistanceCut: None
Initialization: Random
Fixed: None
Varied: All
DensitySat: 0.85
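
The DistanceType above, (1 - PercentIdentity), can be sketched as follows for one pair of already-aligned sequences. This is an assumption-laden illustration: percent identity has several common definitions, and the entry does not say which denominator was used. Here identity is counted over all alignment columns that are not gap-versus-gap. The function name `pid_distance` and the gap character are hypothetical, and the alignment itself (NeedlemanWunsch with BLOSUM62) is taken as given.

```python
def pid_distance(aligned_a, aligned_b, gap="-"):
    """Return 1 - percent identity for two gapped, equal-length aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # ignore columns that are gaps in both sequences
        columns += 1
        if a == b:
            matches += 1  # identical residues (neither can be a gap here)
    if columns == 0:
        raise ValueError("alignment has no usable columns")
    # Assumption: the denominator is the number of aligned columns.
    return 1.0 - matches / columns

# Toy alignment: 5 identical columns out of 6.
d = pid_distance("MKV-LA", "MKVALA")
print(d)
```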

Links

Images


Full Sample with Selected Clusters




Full Sample with Selected Clusters Zoomed-in


COG 95672 NW PID Log(1-d^2) with Sammon


Description

DataSet: COG Size: 95672 Unique: Yes
Aligner: NeedlemanWunsch ScoringMatrix: BLOSUM62 GapOpen: -16 GapExt: -4
DistanceType: (1 - PercentIdentity) Transformation: TM12,TP2
Mapping: Sammon DistanceCut: None
Initialization: Random
Fixed: None
Varied: All
DensitySat: 0.85

Links

Images


Full Sample with Selected Clusters




Full Sample with Selected Clusters Zoomed-in


Friday, May 25, 2012

COG 95672 NW PID Log(1-d^6) with Sammon


Description

DataSet: COG Size: 95672 Unique: Yes
Aligner: NeedlemanWunsch ScoringMatrix: BLOSUM62 GapOpen: -16 GapExt: -4
DistanceType: (1 - PercentIdentity) Transformation: TM12,TP6
Mapping: Sammon DistanceCut: None
Initialization: Random
Fixed: None
Varied: All
DensitySat: 0.85

Links

Images


Full Sample with Selected Clusters


Full Sample with Selected Clusters Zoomed-in


Full Sample with Selected Clusters Zoomed-in Further




Wednesday, May 23, 2012

COG 95672 NW PID Log(1-d) with Sammon

Description

DataSet: COG Size: 95672 Unique: Yes
Aligner: NeedlemanWunsch ScoringMatrix: BLOSUM62 GapOpen: -16 GapExt: -4
DistanceType: (1 - PercentIdentity) Transformation: TM12
Mapping: Sammon DistanceCut: None
Initialization: Random
Fixed: None
Varied: All
DensitySat: 0.85

Links

Images

 

Full Sample with Selected Clusters



Full Sample with Selected Clusters Zoomed-in

Friday, March 30, 2012

The Role of Seven Clusters

The seven clusters were chosen early on as an interesting way to examine the value of the distance transformation. They are:
  • COG0444: 137 members
  • COG4608: 130 members
  • COG1131: 240 members
  • COG1126: 114 members
  • COG1136: 195 members
  • COG3842: 110 members
  • COG3849: 135 members

We show the analysis in terms of:
  • Original distance versus Euclidean distance in the 3D map
  • Original distance for two different methods

The intracluster set is the collection of all pairs of points inside the same cluster; it measures how well individual clusters are mapped.
The intercluster set is the collection of all pairs with one point in one of the seven clusters and the other point in a different one. The quality of these plots measures the relative placement of the clusters.
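
The two pair collections described here, and the original-versus-mapped distance comparison, can be sketched as follows. This is a minimal illustration under assumptions: points are indexed 0..n-1, `labels` gives the cluster of each point, `original` is a full distance matrix, and `points` holds the 3D map coordinates. All names are hypothetical.

```python
import math
from itertools import combinations

def split_pairs(labels):
    """Split all index pairs into same-cluster pairs and cross-cluster pairs."""
    within, between = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            within.append((i, j))   # both points inside the same cluster
        else:
            between.append((i, j))  # one point in each of two clusters
    return within, between

def distance_pairs(pairs, original, points):
    """Pair each original distance with the Euclidean distance in the map."""
    return [(original[i][j], math.dist(points[i], points[j])) for i, j in pairs]

# Toy data: two points in one cluster, one point in another.
labels = ["COG0444", "COG0444", "COG4608"]
original = [[0.0, 0.2, 0.9], [0.2, 0.0, 0.8], [0.9, 0.8, 0.0]]
points = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (1.0, 0.0, 0.0)]

within, between = split_pairs(labels)
within_xy = distance_pairs(within, original, points)
between_xy = distance_pairs(between, original, points)
print(within_xy, between_xy)
```

Plotting each list as a scatter of original distance against mapped distance gives one plot per pair collection; points near the diagonal indicate distances that the map preserves well.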