BankFasta Class Reference

Implementation of IBank for FASTA format.

#include <BankFasta.hpp>



Classes

class  Iterator
 Specific Iterator implementation for the Bank class.

Public Member Functions

 BankFasta (const std::string &filename, bool output_fastq=false, bool output_gz=false)
 ~BankFasta ()
std::string getId ()
tools::dp::Iterator< Sequence > * iterator ()
int64_t getNbItems ()
void insert (const Sequence &item)
void flush ()
u_int64_t getSize ()
void estimate (u_int64_t &number, u_int64_t &totalSize, u_int64_t &maxSize)
void finalize ()
- Public Member Functions inherited from AbstractBank
 AbstractBank ()
std::string getIdNb (int i)
int64_t estimateNbItemsBanki (int i)
const std::vector< IBank * > getBanks () const
int64_t estimateNbItems ()
u_int64_t estimateSequencesSize ()
u_int64_t getEstimateThreshold ()
void setEstimateThreshold (u_int64_t nbSeq)
void remove ()
size_t getCompositionNb ()
- Public Member Functions inherited from Iterable< Sequence >
void iterate (Functor f)
virtual Sequence * getItems (Sequence *&buffer)
virtual size_t getItems (Sequence *&buffer, size_t start, size_t nb)
- Public Member Functions inherited from ISmartPointer
virtual ~ISmartPointer ()
- Public Member Functions inherited from Bag< Sequence >
virtual void insert (const Sequence &item)=0
virtual void insert (const std::vector< Sequence > &items, size_t length=0)
virtual void insert (const Sequence *items, size_t length)
- Public Member Functions inherited from SmartPointer
void use ()
void forget ()

Static Public Member Functions

static const char * name ()

Protected Member Functions

void init ()
- Protected Member Functions inherited from SmartPointer
 SmartPointer ()
virtual ~SmartPointer ()

Static Protected Member Functions

static const size_t getMaxNbFiles ()

Protected Attributes

std::vector< std::string > _filenames
FILE * _insertHandle

Detailed Description

Implementation of IBank for FASTA format.

This class provides FASTA management in GATB.

In fact, it handles both the FASTA and FASTQ formats, either uncompressed or gzip-compressed.

For FASTQ files, the iterated Sequence objects also provide quality information.

Sample of use (note however that it is better to use Bank::open for opening a bank):

// We declare a Bank instance.
BankFasta b (filename);
// We create an iterator over this bank.
BankFasta::Iterator it (b);
// We loop over sequences.
for (it.first(); !it.isDone(); it.next())
{
    // In the following, see how we access the current sequence information through
    // the -> operator of the iterator
    // We dump the data size and the comment
    std::cout << "[" << it->getDataSize() << "] " << it->getComment() << std::endl;
    // We dump the data
    std::cout << it->toString() << std::endl;
}
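
The iteration protocol used in the sample can be tried without GATB through the following minimal sketch. ToySequence and ToyFastaIterator are hypothetical stand-ins, not GATB classes; they only mimic the first()/isDone()/next() calls and the -> access shown above, and they parse plain uncompressed FASTA text held in memory.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a sequence record (not the GATB Sequence class).
struct ToySequence {
    std::string comment;  // header line without the leading '>'
    std::string data;     // sequence lines joined together

    std::size_t getDataSize() const { return data.size(); }
    const std::string& getComment() const { return comment; }
    const std::string& toString() const { return data; }
};

// Hypothetical iterator mimicking the first()/isDone()/next() protocol.
class ToyFastaIterator {
public:
    // Parses FASTA text: a '>' line starts a record, other lines extend its data.
    explicit ToyFastaIterator(const std::string& fastaText) {
        std::istringstream in(fastaText);
        std::string line;
        while (std::getline(in, line)) {
            if (line.empty()) continue;
            if (line[0] == '>') {
                records_.push_back(ToySequence{line.substr(1), ""});
            } else if (!records_.empty()) {
                records_.back().data += line;
            }
        }
    }

    void first() { pos_ = 0; }
    bool isDone() const { return pos_ >= records_.size(); }
    void next() { ++pos_; }
    const ToySequence* operator->() const { return &records_[pos_]; }

private:
    std::vector<ToySequence> records_;
    std::size_t pos_ = 0;
};

// Walks the iterator with the same loop shape as the sample and sums the data sizes.
std::size_t totalDataSize(ToyFastaIterator& it) {
    std::size_t total = 0;
    for (it.first(); !it.isDone(); it.next()) {
        total += it->getDataSize();
    }
    return total;
}
```

Calling totalDataSize on an iterator built from a two-record FASTA string visits each record once, in file order.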

Constructor & Destructor Documentation

BankFasta ( const std::string &  filename,
bool  output_fastq = false,
bool  output_gz = false
)

[in]filename: URI of the bank.
[in]output_fastq: tells whether the file is in FASTQ format or not.
[in]output_gz: tells whether the file is gzipped or not.

~BankFasta ( )


Member Function Documentation

void estimate ( u_int64_t &  number,
u_int64_t &  totalSize,
u_int64_t &  maxSize
)

Give an estimate of the sequence information in the bank.

[out]number: estimated number of sequences
[out]totalSize: estimated total size of the sequences (in bytes)
[out]maxSize: estimated maximum size of a sequence (in bytes)

Implements IBank.
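
This page does not say how the estimate is computed. One plausible strategy, sketched below as an assumption rather than as the BankFasta implementation, is to read a sample of sequences from the start of the file and scale the counts by the ratio of the file size to the bytes consumed. The helper estimateFromSample and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of GATB: extrapolates bank statistics from a
// sample of sequences read at the start of a file.
//   sampleSizes : data size (in bytes) of each sampled sequence
//   sampleBytes : number of file bytes consumed while reading the sample
//   fileBytes   : total size of the file in bytes
void estimateFromSample(const std::vector<std::uint64_t>& sampleSizes,
                        std::uint64_t sampleBytes,
                        std::uint64_t fileBytes,
                        std::uint64_t& number,
                        std::uint64_t& totalSize,
                        std::uint64_t& maxSize)
{
    number = 0;
    totalSize = 0;
    maxSize = 0;
    for (std::uint64_t size : sampleSizes) {
        ++number;
        totalSize += size;
        maxSize = std::max(maxSize, size);
    }

    // Nothing was read, or the sample covers the whole file: keep the raw counts.
    if (sampleBytes == 0 || sampleBytes >= fileBytes) return;

    // Otherwise scale the count and the total size by fileBytes / sampleBytes.
    const double ratio = static_cast<double>(fileBytes) / static_cast<double>(sampleBytes);
    number = static_cast<std::uint64_t>(number * ratio);
    totalSize = static_cast<std::uint64_t>(totalSize * ratio);
    // maxSize stays the largest size seen in the sample; it cannot be extrapolated.
}
```

With this approach the three outputs are exact for small files and approximate for large ones, which matches the "estimation" wording above.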

void finalize ( )

Method that may be called when the bank is done. It is called by the BankFasta destructor, for instance. It closes the underlying file handle with fclose() or an equivalent call. You do not need to call this function yourself.

Reimplemented from AbstractBank.
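
The behaviour described above (safe to call explicitly, and called by the destructor anyway) is commonly obtained with an idempotent finalize(). The sketch below is a generic illustration under that assumption, not the GATB code: ToyBankHandle is hypothetical, and its closer callback stands in for fclose() on the real file handle.

```cpp
#include <functional>
#include <utility>

// Hypothetical resource owner, not a GATB class. The closer callback plays
// the role of fclose() on the underlying handle.
class ToyBankHandle {
public:
    explicit ToyBankHandle(std::function<void()> closer)
        : closer_(std::move(closer)), open_(true) {}

    // The destructor finalizes, so callers never have to do it themselves.
    ~ToyBankHandle() { finalize(); }

    // Copying is disabled so the resource cannot be released by two owners.
    ToyBankHandle(const ToyBankHandle&) = delete;
    ToyBankHandle& operator=(const ToyBankHandle&) = delete;

    bool isOpen() const { return open_; }

    // Idempotent: the resource is released once, later calls do nothing.
    void finalize() {
        if (open_) {
            closer_();
            open_ = false;
        }
    }

private:
    std::function<void()> closer_;
    bool open_;
};
```

Because finalize() checks its own state, an explicit call followed by destruction releases the resource only once.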

void flush ( )

Flush the current content. May be useful for implementations that use a cache.

Implements Bag< Sequence >.
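
To illustrate why flush() matters for a cached implementation, here is a minimal sketch of a writer that accumulates FASTA records in memory and only writes them out on flush(). ToyCachedFastaWriter, its cacheLimit parameter and demoCachedWrite are hypothetical and say nothing about how BankFasta buffers its output.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical cached writer, not a GATB class.
class ToyCachedFastaWriter {
public:
    ToyCachedFastaWriter(std::ostream& out, std::size_t cacheLimit)
        : out_(out), cacheLimit_(cacheLimit) {}

    // Pending records are written when the writer goes away.
    ~ToyCachedFastaWriter() { flush(); }

    // Appends one FASTA record to the cache; flushes once the cache is full.
    void insert(const std::string& comment, const std::string& data) {
        cache_ += '>';
        cache_ += comment;
        cache_ += '\n';
        cache_ += data;
        cache_ += '\n';
        if (cache_.size() >= cacheLimit_) flush();
    }

    // Writes the cached records to the stream and empties the cache.
    void flush() {
        out_ << cache_;
        out_.flush();
        cache_.clear();
    }

private:
    std::ostream& out_;
    std::size_t cacheLimit_;
    std::string cache_;
};

// Usage sketch: returns what the stream held before and after flush(),
// separated by '|'.
std::string demoCachedWrite() {
    std::ostringstream out;
    ToyCachedFastaWriter writer(out, 1024);
    writer.insert("seq1", "ACGT");
    const std::string beforeFlush = out.str();
    writer.flush();
    return beforeFlush + "|" + out.str();
}
```

In demoCachedWrite the stream is still empty after insert(), and the record only appears once flush() has been called.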

std::string getId ( )

Implements IBank.

static const size_t getMaxNbFiles ( )

Returns: the maximum number of files.

int64_t getNbItems ( )

Return the number of items. If a specific implementation doesn't know the value, it should return -1 by convention.

Returns: the number of items if known, -1 otherwise.

Implements Iterable< Sequence >.
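
Callers therefore have to treat -1 as "unknown" rather than as a count. A minimal sketch of that check follows; resolveItemCount is a hypothetical helper, and the fallback value would typically come from an estimate such as the inherited estimateNbItems().

```cpp
#include <cstdint>

// Hypothetical helper, not part of GATB: returns the exact item count when it
// is known, and the supplied estimate when the count is the -1 sentinel.
std::int64_t resolveItemCount(std::int64_t nbItems, std::int64_t estimatedNbItems) {
    const std::int64_t kUnknown = -1;  // convention documented above
    return (nbItems == kUnknown) ? estimatedNbItems : nbItems;
}
```

Note that 0 is a valid exact count (an empty bank) and must not be confused with the unknown case.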

u_int64_t getSize ( )

Return the size of the bank (comments + data)

The returned value may be an approximation in some cases. For instance, with a zipped bank, an implementation may not be able to report the exact size of the original file.

Returns: the bank size in bytes.

Implements IBank.

void init ( )

Initialization method (compute the file sizes).

void insert ( const Sequence &  item )

Insert an item into the bag.

[in]item: the item to be inserted.

Implements IBank.

tools::dp::Iterator<Sequence>* iterator ( )

Create an iterator for the given Iterable instance.

Returns: the new iterator.

Implements IBank.

static const char* name ( )

Returns the name of the bank format.

Member Data Documentation

std::vector<std::string> _filenames

List of URIs of the banks.

FILE* _insertHandle

File handle for inserting sequences into the bank.
