RNA plays essential roles in numerous cellular processes, including gene regulation and DNA replication. These roles are known to be dictated by the higher-order structures of RNA molecules. It is therefore of prime importance to find an RNA sequence that folds into a structure with a particular function desired for use in pharmaceuticals. The challenge of finding an RNA sequence for a given structure is known as the RNA design problem. Although there are some well-known algorithms for this problem, they use only hard constraints to evaluate the predicted sequences. Recently, SHAPE data has emerged as a soft constraint for improving RNA secondary structure prediction algorithms. This motivates us to report a new method for the accurate design of RNA sequences based on their secondary structures, using SHAPE data as pseudo-free energy. We first use a stochastic model to simulate SHAPE data based on a set of 16S and 23S ribosomal RNA sequences. Then, a harmony search algorithm is applied to accurately predict an RNA sequence using free energy and simulated SHAPE data.
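The abstract does not state how SHAPE reactivities are turned into pseudo-free energy. A minimal sketch, assuming the widely used log-linear form ΔG_SHAPE(i) = m·ln(SHAPE_i + 1) + b with the commonly cited defaults m = 2.6 and b = -0.8 kcal/mol, might look like the following. The function names are hypothetical, and adding the term once per paired nucleotide is a simplification for illustration, not the authors' implementation.

```python
import math

# Assumed defaults (kcal/mol) for the log-linear pseudo-energy form;
# the source does not say which parameters it uses.
M_SLOPE = 2.6
B_INTERCEPT = -0.8


def shape_pseudo_energy(reactivity, m=M_SLOPE, b=B_INTERCEPT):
    """Pseudo-free energy for one nucleotide: m * ln(reactivity + 1) + b.

    Missing or negative reactivities are treated as 'no data' and
    contribute nothing.
    """
    if reactivity is None or reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


def paired_positions(structure):
    """Return the set of paired indices in a dot-bracket structure."""
    stack, paired = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            paired.update((stack.pop(), i))
    return paired


def total_pseudo_energy(structure, reactivities):
    """Sum the pseudo-energy over paired nucleotides (simplified)."""
    return sum(
        shape_pseudo_energy(reactivities[i])
        for i in sorted(paired_positions(structure))
    )
```

Under these assumptions, a low reactivity on a paired nucleotide gives a negative (stabilizing) contribution, while a high reactivity gives a positive penalty, which is what lets the data act as a soft rather than a hard constraint.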
We compare our algorithm with some well-known algorithms. Our algorithm precisely predicts 26 new sequences for the structures extracted from the Rfam dataset, while the other algorithms predict 22 out of 29. The proposed algorithm is comparable to them on the RNA-SSD datasets, where they can predict 33 appropriate sequences for RNA secondary structures (out of 34). Finally, we show that the predicted 3D structures of sequences designed with SHAPE data are more similar to those found in nature.
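The harmony search step mentioned in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the objective is a stand-in that only counts complementary pairs at the target's paired positions (the method described above scores candidates with free energy and simulated SHAPE data instead), and the parameter names `hms`, `hmcr`, and `par` follow the usual harmony search terminology rather than anything stated in the source.

```python
import random

BASES = "ACGU"
# Watson-Crick and wobble pairs accepted by the toy objective.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pair_list(structure):
    """Return (i, j) index pairs from a dot-bracket structure."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def toy_score(seq, pairs):
    """Stand-in objective: number of target pairs that are complementary."""
    return sum((seq[i], seq[j]) in PAIRS for i, j in pairs)


def harmony_search(structure, hms=10, hmcr=0.9, par=0.3, iterations=2000, seed=0):
    rng = random.Random(seed)
    pairs = pair_list(structure)
    n = len(structure)
    # Harmony memory: hms random candidate sequences and their scores.
    memory = [[rng.choice(BASES) for _ in range(n)] for _ in range(hms)]
    scores = [toy_score(h, pairs) for h in memory]
    for _ in range(iterations):
        new = []
        for pos in range(n):
            if rng.random() < hmcr:
                # Memory consideration: reuse a base seen at this position.
                base = rng.choice(memory)[pos]
                if rng.random() < par:
                    # Pitch adjustment, taken here as switching to another base.
                    base = rng.choice([b for b in BASES if b != base])
            else:
                # Random selection from the whole alphabet.
                base = rng.choice(BASES)
            new.append(base)
        score = toy_score(new, pairs)
        worst = min(range(hms), key=scores.__getitem__)
        if score > scores[worst]:
            memory[worst], scores[worst] = new, score
    best = max(range(hms), key=scores.__getitem__)
    return "".join(memory[best]), scores[best]


sequence, score = harmony_search("((((....))))")
```

Replacing `toy_score` with a function that folds the candidate and combines its free energy with a SHAPE-derived term would bring the sketch closer to the approach the abstract describes.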
You may download our dataset from here.
Also, you can download HRDSSD-Suk and HRDWSD.
Rfam AC | length | HRDSSD SC | HRDSSD time | HRDSSD-suko SC | HRDSSD-suko time | HRDWSD SC | HRDWSD time | ERD SC | ERD time | MODENA SC | MODENA time | INFO-RNA SC | INFO-RNA time | RNAifold 2 SC | RNAifold 2 time |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RF00001 | 117 | 50 | 1.94 | 50 | 10.12 | 50 | 2.18 | 50 | 0.92 | 28 | 0.79 | 49 | 0.15 | 27 | 8.52 |
RF00002 | 151 | 50 | 1.12 | 50 | 4.14 | 50 | 1.64 | 49 | 2.98 | 13 | 2.00 | 0 | - | 3 | 60.03 |
RF00003 | 161 | 50 | 2.00 | 50 | 8.20 | 23 | 17.00 | 41 | 5.94 | 11 | 2.18 | 0 | - | 0 | - |
RF00004 | 193 | 50 | 0.94 | 50 | 2.66 | 50 | 0.58 | 50 | 2.51 | 48 | 0.77 | 19 | 2.73 | 46 | 0.35 |
RF00005 | 74 | 50 | 0.28 | 50 | 0.66 | 50 | 0.30 | 50 | 0.03 | 35 | 0.37 | 49 | 0.04 | 47 | 0.08 |
RF00006 | 89 | 50 | 0.38 | 50 | 0.96 | 50 | 0.44 | 50 | 0.17 | 44 | 0.34 | 40 | 0.06 | 50 | 0.10 |
RF00007 | 154 | 50 | 0.86 | 50 | 2.96 | 50 | 0.82 | 50 | 0.71 | 48 | 0.58 | 42 | 0.30 | 44 | 0.40 |
RF00008 | 54 | 50 | 0.22 | 50 | 0.60 | 50 | 0.24 | 50 | 0.01 | 46 | 0.24 | 50 | 0.00 | 50 | 0.06 |
RF00009 | 348 | 50 | 9.50 | 50 | 32.04 | 49 | 11.73 | 49 | 22.52 | 39 | 3.13 | 0 | - | 8 | 1.27 |
RF00010 | 357 | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - |
RF00011 | 382 | 1 | 282.00 | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - |
RF00012 | 215 | 50 | 1.02 | 50 | 3.50 | 50 | 0.82 | 50 | 1.52 | 47 | 0.96 | 3 | 23.24 | 50 | 0.41 |
RF00013 | 185 | 50 | 0.96 | 50 | 3.20 | 50 | 0.70 | 50 | 1.42 | 38 | 1.13 | 13 | 2.00 | 50 | 0.42 |
RF00014 | 87 | 50 | 0.26 | 50 | 0.80 | 50 | 0.26 | 50 | 0.04 | 40 | 0.40 | 50 | 0.02 | 50 | 0.12 |
RF00015 | 140 | 50 | 0.54 | 50 | 2.08 | 50 | 0.58 | 50 | 0.86 | 44 | 0.57 | 19 | 1.74 | 47 | 0.33 |
RF00016 | 129 | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - |
RF00017 | 301 | 50 | 2.18 | 50 | 10.00 | 50 | 1.14 | 50 | 2.13 | 42 | 2.76 | 48 | 0.80 | 48 | 2.49 |
RF00018 | 360 | 50 | 10.12 | 50 | 42.46 | 50 | 7.04 | 0 | - | 0 | - | 0 | - | 0 | - |
RF00019 | 83 | 50 | 0.36 | 50 | 1.04 | 50 | 0.30 | 50 | 0.09 | 46 | 0.33 | 47 | 0.03 | 50 | 0.10 |
RF00020 | 119 | 50 | 0.54 | 50 | 1.80 | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - |
RF00021 | 118 | 50 | 0.28 | 50 | 0.74 | 50 | 0.26 | 50 | 0.13 | 40 | 0.63 | 50 | 0.09 | 50 | 0.58 |
RF00022 | 148 | 50 | 0.50 | 50 | 1.44 | 50 | 0.48 | 50 | 1.02 | 44 | 0.68 | 8 | 1.05 | 42 | 0.21 |
RF00024 | 451 | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - |
RF00025 | 210 | 50 | 1.06 | 50 | 3.50 | 50 | 0.68 | 50 | 3.36 | 45 | 0.87 | 1 | 0.93 | 50 | 0.33 |
RF00026 | 102 | 50 | 0.24 | 50 | 0.54 | 50 | 0.22 | 50 | 0.02 | 45 | 0.36 | 2 | 3.11 | 50 | 0.06 |
RF00027 | 79 | 50 | 0.28 | 50 | 0.64 | 50 | 0.28 | 50 | 0.03 | 49 | 0.35 | 50 | 0.04 | 50 | 0.21 |
RF00028 | 344 | 46 | 87.33 | 41 | 174.54 | 45 | 17.67 | 50 | 26.42 | 0 | - | 0 | - | 1 | 1.54 |
RF00029 | 73 | 50 | 0.34 | 50 | 0.88 | 50 | 0.40 | 50 | 0.06 | 39 | 0.28 | 3 | 0.01 | 50 | 0.08 |
RF00030 | 340 | 50 | 7.98 | 50 | 28.74 | 50 | 5.94 | 0 | - | 44 | 2.34 | 0 | - | 26 | 5.55 |
sum | | 1247 | 413.23 | 1241 | 338.24 | 1167 | 71.70 | 1089 | 72.88 | 875 | 22.05 | 543 | 36.32 | 889 | 83.25 |
Id | length | HRDSSD SC | HRDSSD time | HRDSSD-suko SC | HRDSSD-suko time | HRDWSD SC | HRDWSD time | ERD SC | ERD time | MODENA SC | MODENA time | INFO-RNA SC | INFO-RNA time | RNAifold 2 SC | RNAifold 2 time |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AB015827 | 857 | 50 | 152.66 | 49 | 223.02 | 50 | 13.32 | 50 | 7.15 | 45 | 20.64 | 37 | 11.68 | 39 | 40.66 |
AF029195 | 1054 | 50 | 525.42 | 48 | 354.40 | 50 | 29.66 | 50 | 11.58 | 0 | - | 47 | 32.67 | 43 | 2.15 |
AF056938 | 1399 | 50 | 841.30 | 38 | 1738.50 | 50 | 30.08 | 50 | 43.85 | 0 | - | 43 | 112.64 | 0 | - |
AF096836 | 647 | 50 | 22.76 | 49 | 118.06 | 50 | 4.48 | 50 | 4.32 | 0 | - | 43 | 3.40 | 35 | 16.13 |
AF106618 | 351 | 50 | 5.18 | 50 | 24.62 | 50 | 1.54 | 50 | 1.84 | 41 | 3.56 | 47 | 0.58 | 49 | 4.58 |
AF107506 | 338 | 50 | 3.76 | 50 | 24.60 | 50 | 1.78 | 50 | 2.38 | 39 | 3.59 | 44 | 2.04 | 1 | 2.19 |
AF141485 | 474 | 50 | 10.64 | 50 | 78.30 | 50 | 3.32 | 50 | 3.42 | 45 | 5.93 | 33 | 4.61 | 0 | - |
AJ011149 | 377 | 50 | 12.38 | 50 | 41.38 | 48 | 6.85 | 50 | 1.43 | 0 | - | 37 | 1.72 | 22 | 2.90 |
AJ130779 | 507 | 50 | 13.58 | 49 | 109.71 | 50 | 3.98 | 50 | 2.41 | 47 | 6.19 | 49 | 1.72 | 19 | 9.87 |
AJ132572 | 781 | 50 | 103.62 | 38 | 312.26 | 50 | 9.98 | 50 | 7.12 | 0 | - | 42 | 12.90 | 22 | 18.87 |
AJ133622 | 1297 | 1 | 599.00 | 34 | 62.06 | 26 | 176.23 | 50 | 18.79 | 0 | - | 49 | 233.92 | 45 | 289.25 |
AJ236455 | 752 | 6 | 314.17 | 2 | 341.50 | 11 | 136.18 | 50 | 18.22 | 0 | - | 5 | 255.65 | 0 | - |
D38777 | 859 | 50 | 162.08 | 20 | 342.20 | 50 | 17.08 | 50 | 14.14 | 0 | - | 42 | 164.69 | 0 | - |
L11935 | 265 | 50 | 1.92 | 50 | 7.04 | 50 | 0.80 | 50 | 0.78 | 49 | 1.51 | 49 | 0.81 | 50 | 1.21 |
L77117 | 1476 | 2 | 298.50 | 4 | 238.00 | 4 | 238.00 | 50 | 23.63 | 0 | - | 50 | 55.28 | 0 | - |
LIU92530 | 290 | 7 | 90.57 | 4 | 191.00 | 15 | 62.73 | 50 | 1.02 | 0 | - | 22 | 1.26 | 43 | 4.08 |
S70838 | 390 | 46 | 50.57 | 46 | 89.80 | 48 | 7.38 | 50 | 2.99 | 0 | - | 46 | 2.96 | 49 | 18.11 |
U63350 | 419 | 50 | 5.14 | 50 | 26.06 | 50 | 1.76 | 50 | 1.54 | 49 | 4.14 | 48 | 3.49 | 44 | 4.73 |
U81771 | 492 | 50 | 7.80 | 50 | 57.76 | 50 | 2.28 | 50 | 1.99 | 0 | - | 46 | 2.71 | 0 | - |
U84629 | 300 | 48 | 19.46 | 46 | 53.48 | 49 | 4.55 | 50 | 0.97 | 5 | 19.40 | 40 | 2.40 | 50 | 2.43 |
X61771 | 660 | 49 | 86.18 | 36 | 383.28 | 49 | 92.47 | 50 | 13.19 | 0 | - | 17 | 17.09 | 0 | - |
X81949 | 1201 | 18 | 340.50 | 39 | 82.00 | 47 | 206.45 | 50 | 17.76 | 0 | - | 48 | 128.07 | 31 | 19.23 |
X99676 | 1443 | 2 | 544.50 | 0 | - | 4 | 40.50 | 50 | 31.17 | 0 | - | 46 | 220.37 | 0 | - |
Z83250 | 261 | 50 | 6.10 | 50 | 3.70 | 50 | 1.72 | 50 | 0.68 | 35 | 2.20 | 50 | 0.74 | 45 | 1.85 |
sum | | 929 | 3868.72 | 902 | 4812.93 | 1001 | 1085.75 | 1200 | 232.35 | 355 | 67.17 | 980 | 1273.41 | 587 | 438.22 |
Id | length | HRDSSD SC | HRDSSD time | HRDSSD-suko SC | HRDSSD-suko time | HRDWSD SC | HRDWSD time | ERD SC | ERD time | MODENA SC | MODENA time | INFO-RNA SC | INFO-RNA time | RNAifold 2 SC | RNAifold 2 time |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 65 | 50 | 0.2 | 50 | 0.72 | 50 | 0.28 | 50 | 0.04 | 39 | 0.31 | 50 | 0.02 | 50 | 0.10 |
2 | 79 | 50 | 0.26 | 50 | 0.7 | 50 | 0.3 | 50 | 0.03 | 46 | 0.35 | 50 | 0.01 | 50 | 0.09 |
3 | 122 | 50 | 1.7 | 50 | 5.44 | 50 | 0.96 | 50 | 0.29 | 6 | 3.33 | 50 | 0.06 | 50 | 0.18 |
4 | 166 | 50 | 0.72 | 50 | 2.26 | 50 | 0.44 | 50 | 0.34 | 50 | 0.86 | 50 | 0.10 | 50 | 0.53 |
5 | 180 | 50 | 5.12 | 50 | 40.58 | 0 | - | 8 | 9.28 | 0 | - | 0 | - | 0 | - |
6 | 314 | 50 | 2.26 | 50 | 9.24 | 50 | 1.26 | 50 | 4.83 | 49 | 1.92 | 11 | 10.95 | 11 | 1.51 |
7 | 340 | 50 | 4.12 | 50 | 12.84 | 50 | 1.56 | 50 | 3.19 | 45 | 2.64 | 14 | 23.02 | 24 | 2.29 |
8 | 372 | 50 | 5.1 | 50 | 16.34 | 50 | 1.66 | 50 | 13.94 | 46 | 3.02 | 4 | 31.17 | 25 | 2.19 |
9 | 376 | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - |
10 | 583 | 50 | 16.26 | 50 | 97.08 | 50 | 2.82 | 50 | 4.39 | 45 | 9.24 | 43 | 8.00 | 12 | 9.68 |
sum | | 450 | 35.74 | 450 | 185.2 | 400 | 9.28 | 408 | 36.32 | 326 | 21.68 | 272 | 73.34 | 272 | 16.58 |
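The sum rows can be recomputed from the per-structure rows. The sketch below does this for the last table, treating "-" in a time cell as no contribution; the helper name `column_sums` is hypothetical. Recomputed time sums for some methods may differ from the printed ones in the last digit, presumably because the per-row times shown are already rounded.

```python
METHODS = ["HRDSSD", "HRDSSD-suko", "HRDWSD", "ERD", "MODENA", "INFO-RNA", "RNAifold 2"]

# Data rows of the last table, copied from above.
ROWS = """\
1 | 65 | 50 | 0.2 | 50 | 0.72 | 50 | 0.28 | 50 | 0.04 | 39 | 0.31 | 50 | 0.02 | 50 | 0.10 |
2 | 79 | 50 | 0.26 | 50 | 0.7 | 50 | 0.3 | 50 | 0.03 | 46 | 0.35 | 50 | 0.01 | 50 | 0.09 |
3 | 122 | 50 | 1.7 | 50 | 5.44 | 50 | 0.96 | 50 | 0.29 | 6 | 3.33 | 50 | 0.06 | 50 | 0.18 |
4 | 166 | 50 | 0.72 | 50 | 2.26 | 50 | 0.44 | 50 | 0.34 | 50 | 0.86 | 50 | 0.10 | 50 | 0.53 |
5 | 180 | 50 | 5.12 | 50 | 40.58 | 0 | - | 8 | 9.28 | 0 | - | 0 | - | 0 | - |
6 | 314 | 50 | 2.26 | 50 | 9.24 | 50 | 1.26 | 50 | 4.83 | 49 | 1.92 | 11 | 10.95 | 11 | 1.51 |
7 | 340 | 50 | 4.12 | 50 | 12.84 | 50 | 1.56 | 50 | 3.19 | 45 | 2.64 | 14 | 23.02 | 24 | 2.29 |
8 | 372 | 50 | 5.1 | 50 | 16.34 | 50 | 1.66 | 50 | 13.94 | 46 | 3.02 | 4 | 31.17 | 25 | 2.19 |
9 | 376 | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - | 0 | - |
10 | 583 | 50 | 16.26 | 50 | 97.08 | 50 | 2.82 | 50 | 4.39 | 45 | 9.24 | 43 | 8.00 | 12 | 9.68 |
"""


def column_sums(rows_text):
    """Return per-method SC sums and time sums (rounded to 2 decimals)."""
    sc_totals = [0] * len(METHODS)
    time_totals = [0.0] * len(METHODS)
    for line in rows_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        values = cells[2:]  # skip the Id and length columns
        for k in range(len(METHODS)):
            sc_totals[k] += int(values[2 * k])
            time_cell = values[2 * k + 1]
            if time_cell != "-":  # "-" marks a structure with no reported time
                time_totals[k] += float(time_cell)
    return sc_totals, [round(t, 2) for t in time_totals]


sc_sums, time_sums = column_sums(ROWS)
for name, sc, t in zip(METHODS, sc_sums, time_sums):
    print(f"{name}: SC sum = {sc}, time sum = {t}")
```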