Unsupervised machine learning methods such as hierarchical clustering allow us to discover trends and patterns of similarity within data. Here, using a test data matrix, I demonstrate how to apply hierarchical clustering to the columns of that matrix. Since my main focus is bioinformatics, I assume that the columns of the matrix represent individual samples and the rows represent genes, transcripts, or some other biological feature. However, clustering algorithms are not restricted to biology, so the rows and columns of the matrix may represent other things depending on the field of research. For the distance metric, I will use the Spearman correlation-based distance supported by the Dist function of the amap package.

Note: the trailing & sign runs the command in the background.

Getting the MD5 sums of all files in the current folder and writing them to a txt file in Linux.

md5sum * > myChecklist.txt &

Getting the MD5 sums of all files, including those in subfolders, and writing them to a txt file in Linux.

find ./ -type f -exec md5sum {} + > myChecklist.txt &
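A checklist like this is mostly useful for verifying the files later, for example after copying them to another machine. The following is a minimal sketch of that round trip on Linux, assuming GNU md5sum. The temporary folder, the demo file names, and the choice to write the checklist outside the scanned folder (so that the checklist does not end up listing itself) are my assumptions for illustration, not part of the commands above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo folder with one nested file, so the sketch is self-contained.
workdir="$(mktemp -d)"
mkdir -p "$workdir/data/sub"
printf 'hello\n' > "$workdir/data/a.txt"
printf 'world\n' > "$workdir/data/sub/b.txt"

cd "$workdir/data"

# Record checksums for all files and subfolders.
# The checklist is written outside the scanned folder so it is not hashed itself.
find . -type f -exec md5sum {} + > "$workdir/myChecklist.txt"

# Later (or on the receiving machine): re-check every file against the list.
# md5sum -c reports each file and exits non-zero if any checksum differs.
md5sum -c "$workdir/myChecklist.txt"
```

The check has to be run from the same folder in which the checklist was created, because the recorded paths are relative.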

Getting the MD5 sums of all files in the current folder and writing them to a txt file in Mac.

md5 -r * > myChecklist.txt &

Getting the MD5 sums of all files, including those in subfolders, and writing them to a txt file in Mac.
About Me
I am a postdoctoral researcher at the Neuromuscular Disorders Research lab and the Genetic Determinants of Osteoporosis Research lab, at the University of Helsinki and Folkhälsan RC. I specialize in bioinformatics and am interested in machine learning and multi-omics data analysis. My go-to programming language is R.