Michael Cooley


I was born in 1950. By every standard (except my own), I'm an old man. And, invariably, old men have accumulated a number of biases. With respect to this project, these are mine:

I look forward to accumulating additional universally-renowned biases.

Genetic genealogists who wish to report on any number of aspects of the data first need access to the data, not just pictures of the data. Indeed, the little I'm presenting here so far is openly available. I've merely reformatted it according to the above principles. More to come.


GENERIC Ybrowse.org DATA

This Y-SNP database was created from several files at isogg.org and ybrowse.org, including BYSNPindex.xlsx, FTSNPindex.xlsx, snps_hg38.vcf.gz, and others. I've reformatted them (my only contribution) into an easy-to-parse, space-delimited flat file with four fields: pos(ition), anc(estral value), der(ived value) and SNP names. Multiple names are separated by commas with no following space. This is a sample line:

2789173 G C F4532,SK1916,Y525

Here's a very simple Perl script for parsing that line of data. If you have Perl installed on your computer, save the following to a file name of your choice and make it executable with chmod 755.

#!/usr/bin/perl
use strict;
use warnings;

# One record: position, ancestral value, derived value, comma-separated SNP names.
my $data = "2789173 G C F4532,SK1916,Y525";
my ($pos, $anc, $der, $list) = split / /, $data;
my @names = split /,/, $list;

print "Position is $pos\n";
print "Ancestral value is $anc\n";
print "Derived value is $der\n";
print "SNP names: ";

foreach my $name (@names) { print "$name " }

print "\n";

It's likely that no one other than a programmer will be interested in this. And coders working with Y-SNPs probably already have the data in one form or another. Nevertheless, here it sits. Feel free to contact me if the file has gotten too old.

The file will unzip with the extension .db. Don't let that fool you. It's a simple, readable text file. Change the extension to anything you want, such as .txt or .csv. Download ZIP file (note 1).
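Because the unzipped file is plain text, it can be read one line at a time. What follows is only a rough sketch of one way to do that, indexing each record under every one of its SNP names. The subroutine names parse_snp_line and index_by_name and the file name ysnp.db are made up for illustration, and the demo runs on an in-memory copy of the sample line rather than the real download.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one record: "pos anc der name1,name2,...".
# Returns a hash reference, or nothing if the line lacks four fields.
sub parse_snp_line {
    my ($line) = @_;
    $line =~ s/\s+\z//;    # drop the trailing newline (and any stray \r)
    my ($pos, $anc, $der, $list) = split / /, $line;
    return unless defined $list;
    return { pos => $pos, anc => $anc, der => $der, names => [ split /,/, $list ] };
}

# Read every record from an open filehandle and index it under each name.
sub index_by_name {
    my ($fh) = @_;
    my %by_name;
    while (my $line = <$fh>) {
        my $rec = parse_snp_line($line) or next;    # skip malformed lines
        $by_name{$_} = $rec for @{ $rec->{names} };
    }
    return \%by_name;
}

# Demo on an in-memory copy of the sample line. For the real file, open it
# instead, e.g. open(my $fh, '<', 'ysnp.db'), where 'ysnp.db' is a placeholder.
my $sample = "2789173 G C F4532,SK1916,Y525\n";
open(my $fh, '<', \$sample) or die "Cannot open sample: $!";
my $index = index_by_name($fh);
close $fh;

my $hit = $index->{SK1916};
print "SK1916 is at position $hit->{pos} ($hit->{anc} to $hit->{der})\n";
```

Run as written, this prints "SK1916 is at position 2789173 (G to C)". All three names on the sample line point at the same record, so a lookup by F4532 or Y525 gives the same answer.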


TOOLS

BAM REPORTS

Except for the junkiest of reads, every SNP mutation is reported by quantity and quality and, if the variant is named, with all of its names. Unnamed mutations are marked as novel variants. This report can be used to check the validity of a SNP without the bother of clunky chromosome browsers and the learning curve sometimes required to understand them.

An example can be found at http://ysnp.info/public/1142-SNP-Report.txt. There's no phylogeny included. The arrangement of these markers into a tree is a separate study.

Because of restrictions at my content provider, I run BAMs on my personal computer. I'd be happy to do that for anyone interested in having such a report. Click on the Contact link at the top of http://ysnp.info and provide a link to your BAM at dropbox.com, etc. Please preserve the original file name.

Some caveats:

Again, send a message through the ysnp.info contact form and provide a link to the zipped BAM at your file hosting service. I'll download it and send a timely response.




1 The file will unzip to about three times its zipped size.
2 The file can be downloaded from http://ybrowse.org/gbrowse2/gff/hg38ChrY.fa