## InTrans: a pipeline for integrative transcript library construction

### Copyright (C) 2017 ZhiLiang Ji (appo@xmu.edu.cn)

## Requirement

This software runs on any Unix-like system with Python (version 2.7.7) installed.
One Python module is required before use: `configparser` 3.5.0.

In addition, three published programs must be correctly installed in advance and added to your system PATH environment variable. The three programs are:
(1) IDBA (version 1.1.1)
(2) CD-HIT (version 4.5.4)
(3) CAP3 (version 12/21/07)
Other versions of these programs are allowed, but the pipeline is known to run stably with the recommended versions.
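
A quick way to confirm that the three programs are reachable through PATH is a small check like the one below. It is only a sketch: the executable names `idba_tran`, `cd-hit-est` and `cap3` are assumptions, since this README does not say which binaries the pipeline calls, so adjust the list to match your installation.

```python
import os

# Assumed executable names; the README does not state which binaries the
# pipeline invokes, so edit this list to match your installation.
REQUIRED_TOOLS = ["idba_tran", "cd-hit-est", "cap3"]


def find_on_path(name):
    """Return the full path of an executable found on PATH, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names):
    """Return the names that cannot be found on PATH, in the given order."""
    return [name for name in names if find_on_path(name) is None]


missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All listed tools were found on PATH.")
```

The search is written by hand with `os` so that the same script runs under both Python 2.7 and Python 3.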
## Installation Guide

Install the software by simply extracting the package.

## Usage

The package folder you extracted contains three files and one directory: `"InTrans.py", "run.cfg", "__init__.py" and "test_data"`
(1) "InTrans.py" is the `executable script` of the pipeline.
(2) "run.cfg" is the `configuration file`, which contains a series of important parameters. To run the pipeline correctly on your data, set the appropriate parameter values in "run.cfg". `Each parameter is described in "run.cfg"`; if anything is unclear, please consult the manual of the corresponding software.

### Warning

(1) `The default maximum read length of IDBA is 128 bp`. If your reads are longer than that, change the value 128 to a larger one (e.g. 250) in "xx/idba-xxx/src/sequence/short_sequence.h":
"static const uint32_t kMaxShortSequence = 128;"
->
"static const uint32_t kMaxShortSequence = 250;"
(2) Correspondingly, change the default k-mer unit to a larger value (e.g. 8) in "xx/idba-xxx/src/basic/kmer.h":
"static const uint32_t kNumUint64 = 4;"
->
"static const uint32_t kNumUint64 = 8;"
(3) Recompile IDBA after these modifications so that the new read length and k-mer settings take effect.
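
The two header edits above can also be scripted. The sketch below is a hypothetical helper, not part of InTrans or IDBA: it rewrites the value of one `static const uint32_t` declaration in a piece of source text and refuses to proceed unless the declaration is found exactly once.

```python
import re


def set_constant(source, name, value):
    """Return source with `static const uint32_t <name> = N;` set to value.

    Raises ValueError unless the declaration occurs exactly once.
    """
    pattern = (r"(static\s+const\s+uint32_t\s+" + re.escape(name)
               + r"\s*=\s*)\d+(\s*;)")
    patched, count = re.subn(
        pattern,
        lambda m: m.group(1) + str(value) + m.group(2),
        source,
    )
    if count != 1:
        raise ValueError(
            "expected one declaration of %s, found %d" % (name, count))
    return patched


# Demonstration on the two declarations quoted in the warning.
print(set_constant(
    "static const uint32_t kMaxShortSequence = 128;",
    "kMaxShortSequence", 250))
print(set_constant(
    "static const uint32_t kNumUint64 = 4;", "kNumUint64", 8))
```

To apply it to a real installation, read the text of each header, pass it through `set_constant`, write the result back, and then recompile IDBA as described in step (3).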
## Running

Once the parameter values have been set in "run.cfg", run the pipeline with:
$ `python InTrans.py run.cfg`
For example, you can run a test with the data in `"test_data"`:
(1) Run without heterogeneous data; the corresponding configuration file is `run_fq.cfg`:
$ `cd ./test_data/`
$ `python ../InTrans.py run_fq.cfg`
(2) Run with heterogeneous data; the corresponding configuration file is `run_fq_heterogen.cfg`:
$ `cd ./test_data/`
$ `python ../InTrans.py run_fq_heterogen.cfg`
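
If a run stops because of a configuration problem, it can help to check that the file parses and to list what it contains. The sketch below assumes the configuration files use the INI-style layout read by the `configparser` module; this README does not show their contents, so the section and option names in the sample text are purely hypothetical.

```python
# On Python 2.7 this import is provided by the configparser backport
# listed under Requirement; on Python 3 it is part of the standard library.
import configparser


def summarize_config(text):
    """Parse INI-style text and return {section: {option: value}}."""
    # interpolation=None keeps literal '%' characters in values intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser.items(section))
            for section in parser.sections()}


# Hypothetical sample; the real option names are documented in "run.cfg".
sample = u"""
[example_section]
example_option = 42
"""
for section, options in summarize_config(sample).items():
    print(section)
    for key, value in sorted(options.items()):
        print("  %s = %s" % (key, value))
```

For a real file, pass the text of the configuration file (for example the contents of "run.cfg") to `summarize_config` instead of the sample string.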
## Output

Two folders and one log file are generated when the program finishes:
(1) "output" folder
contains the final transcript file, in FASTA format.
(2) "temp_output" folder
contains the temporary files produced during the run, including the output of IDBA, CD-HIT, and CAP3.
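
Since the final transcripts are in FASTA format, a short script can summarize them. The following is a minimal sketch, not part of InTrans: it takes any iterable of lines and returns each record's header and sequence length. The demo records are made up for illustration.

```python
def fasta_lengths(lines):
    """Return a list of (header, sequence_length) pairs from FASTA lines."""
    records = []
    header, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, length))
            header, length = line[1:], 0
        elif header is not None:
            length += len(line)
    if header is not None:
        records.append((header, length))
    return records


# Made-up records for illustration only.
demo = [">t1", "ACGT", "AC", ">t2", "GGG"]
for name, size in fasta_lengths(demo):
    print("%s\t%d" % (name, size))
```

For a real run, pass an open file handle for the FASTA file in the "output" folder. The file name is not given in this README, so substitute the name produced by your run.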