Private:Bioinformatics
		
		
		
		
Bioinformatics
A field that combines technologies from biology, computer science, mathematics, and statistics. [1]
Bioinformatics workflow steps
- quality control assessment
- sequence alignment
- data summarization into genes/regions
- data annotation to genomic features
- statistical comparisons
- multi-omic integration
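
As a small illustration of the first step (quality control assessment), the sketch below computes the mean Phred quality of a read from its FASTQ quality string and applies a pass/fail cutoff. It assumes Phred+33 encoding, and the function names and the cutoff of 20 are hypothetical choices for this example, not values taken from the source. Real pipelines normally use a dedicated QC tool rather than hand-written code.

```python
def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores.

    Assumes Phred+33 encoding by default (an assumption of this sketch).
    """
    return [ord(ch) - offset for ch in quality_string]


def mean_quality(quality_string, offset=33):
    """Return the arithmetic mean Phred score of a read (0.0 for an empty read)."""
    scores = phred_scores(quality_string, offset)
    if not scores:
        return 0.0
    return sum(scores) / len(scores)


def passes_qc(quality_string, min_mean=20.0, offset=33):
    """Return True if the read's mean quality meets the (hypothetical) cutoff."""
    return mean_quality(quality_string, offset) >= min_mean


if __name__ == "__main__":
    # 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
    for qual in ("IIII", "II##", "####"):
        print(qual, mean_quality(qual), passes_qc(qual))
```

Averaging Phred scores is a deliberately simple criterion; it is only meant to show what a per-read quality check looks like.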
 
Bioinformatics curated software list [2]
- Package suites
- Data Tools
  - Downloading
  - Compressing
- Data Processing
  - Command Line Utilities
- Next Generation Sequencing
  - Workflow Managers
  - Pipelines
  - Sequence Processing
  - Data Analysis
  - Sequence Alignment
    - Pairwise
    - Multiple Sequence Alignment
    - Clustering
  - Quantification
  - Variant Calling
    - Structural variant callers
  - BAM File Utilities
  - VCF File Utilities
  - GFF BED File Utilities
  - Variant Simulation
  - Variant Prediction/Annotation
  - Python Modules
    - Data
    - Tools
  - Assembly
  - Annotation
- Long-read sequencing
  - Long-read Assembly
- Visualization
  - Genome Browsers / Gene Diagrams
  - Circos Related
- Database Access
- Resources
  - Becoming a Bioinformatician
  - Bioinformatics on GitHub
  - Sequencing
  - RNA-Seq
  - ChIP-Seq
  - YouTube Channels and Playlists
  - Blogs
  - Miscellaneous
- Online networking groups
 
File formats in bioinformatics
This section lists some of the commonly used file formats in bioinformatics. [3]
| File format | File extensions |
|---|---|
| FASTA | .fa, .fasta, .fsa |
| FASTQ | .fastq, .sanfastq, .fq |
| SAM (Sequence Alignment/Map) | .sam |
| BAM | .bam |
| VCF (Variant Call Format) | .vcf |
| GFF (General Feature Format or Gene Finding Format) | .gff, .gff2, .gff3 |
| GTF (Gene Transfer Format) | .gtf |
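
To make the table concrete, here is a minimal sketch that maps a file name to its format using the extensions listed above and parses FASTA text into a dictionary. The names `detect_format` and `parse_fasta` are hypothetical helpers written for this example. The sketch ignores compressed files (for example `.gz`) and is not a replacement for an established parser.

```python
from pathlib import Path

# Extension-to-format mapping taken from the table above.
EXTENSION_TO_FORMAT = {
    ".fa": "FASTA", ".fasta": "FASTA", ".fsa": "FASTA",
    ".fastq": "FASTQ", ".sanfastq": "FASTQ", ".fq": "FASTQ",
    ".sam": "SAM",
    ".bam": "BAM",
    ".vcf": "VCF",
    ".gff": "GFF", ".gff2": "GFF", ".gff3": "GFF",
    ".gtf": "GTF",
}


def detect_format(filename):
    """Return the format name for a file name, or None if the extension is unknown."""
    return EXTENSION_TO_FORMAT.get(Path(filename).suffix.lower())


def parse_fasta(lines):
    """Parse FASTA lines into {sequence_id: sequence}.

    The ID is the first whitespace-delimited token after '>'.
    Sequence lines that wrap over several lines are joined.
    """
    records = {}
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


if __name__ == "__main__":
    print(detect_format("sample.fq"))
    print(parse_fasta([">seq1 example", "ACGT", "TT", ">seq2", "GG"]))
```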
Useful tutorial links
- https://vcru.wisc.edu/simonlab/bioinformatics/programs/
- https://github.com/danielecook/Awesome-Bioinformatics
- https://mybiosoftware.com/
- https://bioinformatics.uconn.edu/resources-and-events/tutorials-2/
- Cluster system software modules: https://bioinformatics.uconn.edu/cbc_software/software-2/
 
We can use Bioconda. [4]
Bioconda only supports Python 2.7, 3.6, 3.7, 3.8, and 3.9, so DLS38 can be used.
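
A quick way to check whether a given interpreter falls within the versions listed above is sketched below. The set of versions is copied from this note and may be out of date, and `is_supported` is a hypothetical helper, not part of Bioconda.

```python
import sys

# Python versions listed in the note above (assumption: this list may be outdated).
SUPPORTED_VERSIONS = {(2, 7), (3, 6), (3, 7), (3, 8), (3, 9)}


def is_supported(version=None):
    """Return True if (major, minor) of `version` is in the listed set.

    `version` may be any sequence starting with major and minor numbers;
    it defaults to the running interpreter's version.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    print("Running Python %d.%d" % tuple(sys.version_info[:2]))
    print("In listed versions:", is_supported())
```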
Libraries and sources
| Libraries | Description | References | 
|---|---|---|
| rSeq: RNA-Seq Analyzer | https://jhui2014.github.io/rseq/ | On 61 server, /test/bioinfomatics/rseq/rseq-0.2.2-src |
| SNVMix2 | https://github.com/shahcompbio/snvmix | imp@CGX-GPU:~/test/bioinfomatics/snvmix (master)$ |
| Samtools | SAM (Sequence Alignment/Map) format is a generic format for storing large nucleotide sequence alignments. | |
| BreakDancer | BreakDancer uses CMake, a cross-platform build tool that generates a Makefile so you can use make. The requirements are the zlib development library, gcc, gmake, and CMake 2.8+. Beginning with version 1.4.4, BreakDancer includes samtools as part of the build process. | |
| BWA | BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW, and BWA-MEM. | |
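
As an example of how two of the tools in the table are commonly combined, the sketch below builds the command lines for indexing a reference with BWA, aligning reads with BWA-MEM, and sorting and indexing the result with samtools. The file names, the output prefix, and the helper names are hypothetical. The sketch assumes `bwa` and `samtools` are on the PATH and that single-end reads are used; it is not a description of how these tools are set up on the servers listed above.

```python
import shutil
import subprocess


def build_alignment_commands(reference, reads, prefix):
    """Return a list of (argv, stdout_path) pairs for a basic alignment run.

    stdout_path is None unless the tool writes its result to standard output
    (bwa mem prints SAM to stdout, so it is redirected to a file).
    """
    sam = prefix + ".sam"
    bam = prefix + ".sorted.bam"
    return [
        (["bwa", "index", reference], None),             # build the BWA index
        (["bwa", "mem", reference, reads], sam),         # align reads, SAM on stdout
        (["samtools", "sort", "-o", bam, sam], None),    # coordinate-sort into BAM
        (["samtools", "index", bam], None),              # index the sorted BAM
    ]


def run_commands(commands):
    """Run each command in order, stopping at the first failure."""
    for argv, stdout_path in commands:
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(argv[0] + " not found on PATH")
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


if __name__ == "__main__":
    # Only print the commands here; call run_commands() to execute them.
    for argv, stdout_path in build_alignment_commands("ref.fa", "reads.fq", "sample"):
        suffix = " > " + stdout_path if stdout_path else ""
        print(" ".join(argv) + suffix)
```

Separating command construction from execution makes it possible to review the exact command lines before anything is run.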