Rho-independent Terminator Evaluation

Transcription termination in bacteria is a crucial regulatory step that ensures accurate gene expression. Terminator sequences can be broadly classified into Rho-dependent and Rho-independent terminators. While Rho-dependent terminators require the Rho protein to facilitate RNA polymerase dissociation, Rho-independent terminators function autonomously, relying solely on intrinsic sequence features.

A Rho-independent terminator is typically characterised by a GC-rich stem-loop structure flanked by short poly-A and poly-U tracts. The upstream poly-A tract can stabilise the formation of the stem-loop, whereas the downstream poly-U tract facilitates RNA polymerase release. The stem-loop itself consists of a paired stem formed by complementary base-pairing and a loop connecting the arms, creating a physical barrier that destabilises the RNA–DNA hybrid and promotes termination.
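The layout described above can be made concrete with a minimal sketch that assembles an RNA terminator from its parts. The function names, the example stem and loop, and the default tract lengths (6 A, 8 U) are illustrative assumptions rather than values taken from this work, and only Watson-Crick pairs are considered.

```python
# Watson-Crick partners for RNA bases (G-U wobble pairs are ignored in this sketch).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def build_terminator(stem: str, loop: str, a_tract: int = 6, u_tract: int = 8) -> str:
    """Assemble poly-A + 5' stem arm + loop + 3' stem arm + poly-U.

    The tract lengths are assumed defaults chosen for illustration only.
    """
    return "A" * a_tract + stem + loop + reverse_complement(stem) + "U" * u_tract


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical example: a fully GC stem with a four-base loop.
terminator = build_terminator(stem="GCCGCC", loop="GAAA")
print(terminator)             # AAAAAAGCCGCCGAAAGGCGGCUUUUUUUU
print(gc_fraction("GCCGCC"))  # 1.0
```

Because the 3' arm is derived as the reverse complement of the 5' arm, the two arms can always pair to form the stem, with the loop bases left unpaired between them.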

To explore terminator features computationally, we generate random sequences that mimic Rho-independent terminators while preserving their structural elements: the stem-loop and the flanking poly-A and poly-U tracts. These synthetic sequences serve as training data for a machine learning model. We employ XGBoost, a gradient boosting algorithm, to learn patterns from the generated sequences, with the goal of predicting terminator activity and understanding the sequence-structure determinants of transcription termination.
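One way such sequences might be generated is sketched below using only the Python standard library. The length ranges, the GC bias of the stem, the choice of features, and all function names are assumptions made for illustration; they are not the settings used in this work. How terminator activity labels are assigned is not specified here, so the sketch stops at the feature matrix.

```python
import random

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def random_terminator(rng, stem_len=(5, 9), loop_len=(3, 6), gc_bias=0.8,
                      a_len=(4, 8), u_len=(6, 9)):
    """Draw one synthetic terminator; every range here is an assumed value."""
    # Each stem base is G/C with probability gc_bias, otherwise A/U.
    stem = "".join(
        rng.choice("GC") if rng.random() < gc_bias else rng.choice("AU")
        for _ in range(rng.randint(*stem_len))
    )
    loop = "".join(rng.choice("ACGU") for _ in range(rng.randint(*loop_len)))
    # The 3' arm is the reverse complement of the 5' arm, so the stem always pairs.
    arm3 = "".join(RNA_COMPLEMENT[b] for b in reversed(stem))
    return {
        "a_tract": "A" * rng.randint(*a_len),
        "stem": stem,
        "loop": loop,
        "arm3": arm3,
        "u_tract": "U" * rng.randint(*u_len),
    }


def to_sequence(parts):
    """Concatenate the parts in 5'-to-3' order."""
    return (parts["a_tract"] + parts["stem"] + parts["loop"]
            + parts["arm3"] + parts["u_tract"])


def to_features(parts):
    """Numeric features: tract lengths, stem length, stem GC fraction, loop length."""
    stem = parts["stem"]
    return [
        len(parts["a_tract"]),
        len(stem),
        sum(b in "GC" for b in stem) / len(stem),
        len(parts["loop"]),
        len(parts["u_tract"]),
    ]


rng = random.Random(0)  # fixed seed so the synthetic set is reproducible
samples = [random_terminator(rng) for _ in range(100)]
X = [to_features(p) for p in samples]
```

A feature matrix of this shape, paired with activity labels, could then be passed to a gradient boosting model such as `xgboost.XGBClassifier` or `xgboost.XGBRegressor`. Keeping the parts separate until `to_sequence` is called makes it straightforward to derive structural features without re-parsing the sequence.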