Board Thread:Fossil Fuels/@comment-1259419-20160119225225/@comment-1259419-20160124220635

You are wrong. Those peptide pieces were part of a large protein. They fell apart during the mass spectrometry. You have to do a BLAST on the ENTIRE protein. BLAST searches on sequences only 8-10 amino acids long are useless.

Those peptides are loose pieces because of the analysis method used. The plots show that the pieces came from ONE species-specific protein. Otherwise there would be multiple versions of the same peaks in the graph (compare with ostrich).

So, if you do a BLAST on all the different peptide pieces, and BLAST gives you a different species for each peptide, you should know you are doing something wrong.

If you have multiple pieces of the same protein, you do NOT BLAST them piece by piece, because a sequence of length 10 (which most of these pieces are) can look similar to lots of things in the whole of GenBank. You need a much longer sequence to get trustworthy results.

If you BLAST this piece:

SYELPDGQVITIGNER

At the top of your BLAST hits are human and the fish Coregonus lavaretus. However, if you combine this piece with another of the discovered peptide pieces, like this:

YVALDFEQEMATAASSSSLESSYELPDGQVITIGNER

Your top result is the moth Micromelalopha troglodyta. That shows how risky it is to use tiny pieces of sequence.
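To get a feeling for why the short piece is less trustworthy, here is a rough Python sketch. It is NOT what BLAST does: it only slides each query along a made-up random "database" without gaps and reports the best fraction of identical positions found purely by chance. The uniform random residues, the database size of 30000 and the helper name best_identity are all my own assumptions for illustration.

```python
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# The two queries quoted above.
SHORT_QUERY = "SYELPDGQVITIGNER"
LONG_QUERY = "YVALDFEQEMATAASSSSLESSYELPDGQVITIGNER"


def best_identity(query, database):
    """Best fraction of identical positions over all ungapped windows.

    Hypothetical helper: a toy stand-in for a similarity search,
    with no substitution matrix, no gaps and no statistics.
    """
    n = len(query)
    best = 0
    for start in range(len(database) - n + 1):
        window = database[start:start + n]
        matches = sum(a == b for a, b in zip(query, window))
        best = max(best, matches)
    return best / n


# Assumption: a database of uniformly random residues. Real proteins are
# not random, so this only shows the chance-match effect, nothing more.
rng = random.Random(0)
database = "".join(rng.choice(AMINO_ACIDS) for _ in range(30000))

short_best = best_identity(SHORT_QUERY, database)
long_best = best_identity(LONG_QUERY, database)

print(f"best chance identity, {len(SHORT_QUERY)} residues: {short_best:.0%}")
print(f"best chance identity, {len(LONG_QUERY)} residues: {long_best:.0%}")
```

In this toy setup the shorter query should reach a clearly higher best identity against pure noise than the longer one, which is the same effect that makes hits for tiny peptides hard to interpret.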

They added all pieces together (see figure 3). This is what they found:

the actin sequence matched with the following: Archosauria + Testudines (100%), vertebrates (100%), fungi (92%), and cellular slime molds (88%);

I have tried to do this on my own. This is the sequence in figure 3:

XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXM LMSILPQPAM DEEIAALVVD NGSGXCKAGF AGDDAPRAVF PSXXXXXXXX XXXXXVGMGQ KDSYVGDEAQ SKXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXVAPE EHPILLXXXX XXXXXXXXXX XXXXXXXXXX XXXXDXXXXL TDYLMKXXXX XGYSFTTTAE EXXXXDIKEK LCXXXYVALD FEQEMATAAS SSSLESSYEL PDGQVITIGN XXXXXXXXXX ERFRCPEALF QPSXXXXXXX XXXXXXXXXX XXXXXXXXXD LYANTVLSGG TTMYPGIADR MQEEITALAP STMKXXIIAP PERKYSVWIG GSILASLSTF QQMWISKQEY DESGPSIVHR KXX
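For anyone who wants to check how much of this sequence is actually known, a small Python sketch like the one below should do it. It assumes that X marks a position where no residue was recovered (that is my reading of the figure), and the variable names are just mine.

```python
import re

# Sequence copied from figure 3 as quoted above, in blocks of 10.
FIGURE_3 = (
    "XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXM LMSILPQPAM "
    "DEEIAALVVD NGSGXCKAGF AGDDAPRAVF PSXXXXXXXX XXXXXVGMGQ "
    "KDSYVGDEAQ SKXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXVAPE "
    "EHPILLXXXX XXXXXXXXXX XXXXXXXXXX XXXXDXXXXL TDYLMKXXXX "
    "XGYSFTTTAE EXXXXDIKEK LCXXXYVALD FEQEMATAAS SSSLESSYEL "
    "PDGQVITIGN XXXXXXXXXX ERFRCPEALF QPSXXXXXXX XXXXXXXXXX "
    "XXXXXXXXXD LYANTVLSGG TTMYPGIADR MQEEITALAP STMKXXIIAP "
    "PERKYSVWIG GSILASLSTF QQMWISKQEY DESGPSIVHR KXX"
)

# Remove the spaces that separate the blocks of 10.
sequence = "".join(FIGURE_3.split())

# Assumption: X = unknown residue. Each run without X is one known stretch.
fragments = re.findall(r"[^X]+", sequence)

known = sum(len(f) for f in fragments)
coverage = known / len(sequence)
longest = max(fragments, key=len)

print(f"total length:   {len(sequence)}")
print(f"known residues: {known} ({coverage:.0%})")
print(f"known stretches: {len(fragments)}, longest has {len(longest)} residues")
for fragment in fragments:
    print(len(fragment), fragment)
```

This only summarises the gaps. It does not replace the BLAST search itself, which I ran on the whole sequence.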

In the BLAST results for this sequence I do not see 100% identity with any sequence, which weakens the case for contamination from extant species. Among the top 15 hits I see a lot of birds and alligator (I do not know how reliable the PREDICTED sequences are), but also mammals.

I think we just do not fully understand how to use BLAST. If they had made up the BLAST results, the reviewers should have seen that easily. And if not the reviewers, any critical reader would see it. Then the careers of all these authors would be in jeopardy.

I'll try to read the paper more thoroughly once I have the time.