- In R, create a function Score() that calculates the score for a consensus string.
  - Input:
    - An array of starting indexes.
    - DNAStringSet object of sequences (for example, the file seq_score.fasta).
    - Motif length.
  - Output:
    - The score for the consensus string.
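One possible shape for Score() is sketched below. It assumes the usual definition of the score (for every motif column, count the most frequent base, then sum these counts) and assumes the sequences can be turned into plain strings with as.character(), which holds for a DNAStringSet when Biostrings is loaded and for an ordinary character vector. The example sequences in `seqs` are made up for illustration.

```r
# A minimal sketch of Score(), not a reference solution.
Score <- function(s, DNA, l) {
  # Assumption: as.character() yields one string per sequence
  seqs <- as.character(DNA)
  # Cut the l-mer starting at s[i] out of sequence i (substring is vectorised)
  motifs <- substring(seqs, s, s + l - 1)
  # Alignment matrix: rows = sequences, columns = motif positions
  alignment <- do.call(rbind, strsplit(motifs, ""))
  # Per column, take the count of the most frequent base and sum over columns
  sum(apply(alignment, 2, function(column) max(table(column))))
}

# Hypothetical toy data: "ACGT" starts at positions 1, 3 and 4
seqs <- c("ACGTACGT", "TTACGTAA", "GGGACGTC")
Score(c(1, 3, 4), seqs, 4)  # 12, because all three 4-mers are identical
```

With t sequences and motif length l, the highest possible score is t * l, which is reached only when all selected l-mers are identical.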
- In R, create a function NextLeaf() according to the following pseudocode.
  - Input:
    - s: an array of starting indexes s = (s1 s2 … st), where t is the number of sequences.
    - t: number of sequences.
    - k: k = n - l + 1, where n is the length of the sequences and l is the motif length.
  - Output:
    - s: an array of starting indexes that corresponds to the next leaf in the tree.
NextLeaf(s, t, k)
    for i ← t to 1
        if s[i] < k
            s[i] ← s[i] + 1
            return s
        s[i] ← 1
    return s
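The pseudocode above translates almost line by line into R. The following is a sketch rather than the expected solution; it works like an odometer, incrementing the last index that is still below k and resetting every index to its right to 1.

```r
# A minimal sketch of NextLeaf() following the pseudocode.
NextLeaf <- function(s, t, k) {
  for (i in t:1) {
    if (s[i] < k) {
      s[i] <- s[i] + 1   # this position can still advance
      return(s)
    }
    s[i] <- 1            # position is at its maximum: reset and carry left
  }
  s                      # every position was at k, so we wrap to (1, ..., 1)
}

NextLeaf(c(1, 1), 2, 3)  # 1 2
NextLeaf(c(1, 3), 2, 3)  # 2 1
NextLeaf(c(3, 3), 2, 3)  # 1 1 (wrapped around to the first leaf)
```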
- In R, create a function BFMotifSearch() according to the following pseudocode.
  - Input:
    - DNA: DNAStringSet object of sequences (for example, the file seq_motif.fasta).
    - t: number of sequences.
    - n: length of each sequence.
    - l: motif length.
  - Output:
    - bestMotif: an array of starting positions for each sequence with the best score for the consensus string.
BFMotifSearch(DNA, t, n, l)
    s ← (1, 1, ..., 1)
    bestScore ← Score(s, DNA, l)
    while forever
        s ← NextLeaf(s, t, n - l + 1)
        if Score(s, DNA, l) > bestScore
            bestScore ← Score(s, DNA, l)
            bestMotif ← (s1, s2, ..., st)
        if s = (1, 1, ..., 1)
            return bestMotif
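A brute-force search along these lines could look like the sketch below. To keep the block runnable on its own, it repeats compact versions of Score() and NextLeaf(); both are assumptions about how those helpers might be written. The sketch also initialises bestMotif to the first leaf so that a result is defined even when no later leaf scores higher, and it stores the score in a variable instead of computing it twice. The toy sequences are hypothetical.

```r
# Compact helper sketches (assumed implementations, see the tasks above)
Score <- function(s, DNA, l) {
  motifs <- substring(as.character(DNA), s, s + l - 1)
  alignment <- do.call(rbind, strsplit(motifs, ""))
  sum(apply(alignment, 2, function(column) max(table(column))))
}
NextLeaf <- function(s, t, k) {
  for (i in t:1) {
    if (s[i] < k) {
      s[i] <- s[i] + 1
      return(s)
    }
    s[i] <- 1
  }
  s
}

# A minimal sketch of BFMotifSearch(): visit every leaf, keep the best one
BFMotifSearch <- function(DNA, t, n, l) {
  s <- rep(1, t)
  bestScore <- Score(s, DNA, l)
  bestMotif <- s                      # assumption: start with the first leaf
  repeat {
    s <- NextLeaf(s, t, n - l + 1)
    score <- Score(s, DNA, l)
    if (score > bestScore) {
      bestScore <- score
      bestMotif <- s
    }
    if (all(s == 1)) {                # back at the first leaf: all leaves visited
      return(bestMotif)
    }
  }
}

# Hypothetical toy data: "ACGT" is planted at positions 3, 1 and 2
seqs <- c("TTACGT", "ACGTTT", "GACGTG")
BFMotifSearch(seqs, t = 3, n = 6, l = 4)  # 3 1 2
```

The loop evaluates all (n - l + 1)^t leaves, so this approach is only practical for small inputs.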
- In R, create a function NextVertex() according to the following pseudocode.
  - Input:
    - s: an array of starting indexes s = (s1 s2 … st), where t is the number of sequences.
    - i: level of vertex.
    - t: number of sequences.
    - k: k = n - l + 1, where n is the length of the sequences and l is the motif length.
  - Output:
    - s: the next vertex in the tree.
    - Current level of vertex.
NextVertex(s, i, t, k)
    if i < t
        s[i + 1] ← 1
        return (s, i + 1)
    else
        for j ← t to 1
            if s[j] < k
                s[j] ← s[j] + 1
                return (s, j)
    return (s, 0)
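Because the pseudocode returns a pair (s, level) and an R function returns a single value, one option is to wrap both in a named list. The element names `s` and `i` used in this sketch are an assumption, not part of the assignment.

```r
# A minimal sketch of NextVertex(); returns list(s = ..., i = ...)
NextVertex <- function(s, i, t, k) {
  if (i < t) {
    # Inner vertex: descend to its first child
    s[i + 1] <- 1
    return(list(s = s, i = i + 1))
  }
  # Leaf: move to the next sibling, or climb until one exists
  for (j in t:1) {
    if (s[j] < k) {
      s[j] <- s[j] + 1
      return(list(s = s, i = j))
    }
  }
  list(s = s, i = 0)   # level 0 signals that the whole tree was traversed
}

NextVertex(c(1, 2), 1, 2, 2)  # s = 1 1, i = 2 (descend)
NextVertex(c(1, 2), 2, 2, 2)  # s = 2 2, i = 1 (climb and move right)
NextVertex(c(2, 2), 2, 2, 2)  # s = 2 2, i = 0 (traversal finished)
```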
- In R, create a function ByPass() according to the following pseudocode.
  - Input:
    - s = (s1 s2 … st): an array of starting indexes, where t is the number of sequences.
    - i: level of vertex.
    - t: number of DNA sequences.
    - k = n - l + 1, where n is the length of the DNA sequences and l is the motif length.
  - Output:
    - The next leaf after a skip of a subtree.
    - Current level of vertex.
ByPass(s, i, t, k)
    for j ← i to 1
        if s[j] < k
            s[j] ← s[j] + 1
            return (s, j)
    return (s, 0)
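ByPass() differs from NextVertex() only in that it never descends: it starts at level i and looks for the nearest position at or above that level that can still be incremented, which skips the whole subtree below the current vertex. The sketch below again returns a named list with the assumed element names `s` and `i`, and it assumes it is called with i >= 1.

```r
# A minimal sketch of ByPass(); returns list(s = ..., i = ...)
ByPass <- function(s, i, t, k) {
  for (j in i:1) {       # assumes i >= 1
    if (s[j] < k) {
      s[j] <- s[j] + 1   # jump to the next sibling at level j
      return(list(s = s, i = j))
    }
  }
  list(s = s, i = 0)     # nothing left to visit
}

ByPass(c(1, 2, 1), 2, 3, 2)  # s = 2 2 1, i = 1
ByPass(c(2, 2, 1), 2, 3, 2)  # s = 2 2 1, i = 0
```

Entries of s to the right of the returned level are left as they are; they are overwritten when the search descends again.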
- In R, create a function BBMotifSearch() according to the following pseudocode.
  - Input:
    - DNA: DNAStringSet object of sequences (for example, the file seq_motif.fasta).
    - t: number of sequences.
    - n: length of each sequence.
    - l: motif length.
  - Output:
    - bestMotif: an array of starting positions for each sequence with the best score for the consensus string.
- Modify the function Score() to calculate the score for the consensus string of the first i sequences of DNA.
BBMotifSearch(DNA, t, n, l)
    s ← (1, ..., 1)
    bestScore ← 0
    i ← 1
    while i > 0
        if i < t
            optimisticScore ← Score(s, i, DNA, l) + (t - i) * l
            if optimisticScore < bestScore
                (s, i) ← ByPass(s, i, t, n - l + 1)
            else
                (s, i) ← NextVertex(s, i, t, n - l + 1)
        else
            if Score(s, t, DNA, l) > bestScore
                bestScore ← Score(s, t, DNA, l)
                bestMotif ← (s1, s2, ..., st)
            (s, i) ← NextVertex(s, i, t, n - l + 1)
    return bestMotif
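Putting the pieces together, a branch-and-bound search might be sketched as follows. The block repeats compact, assumed versions of the helpers so that it runs on its own; Score() takes the extra argument i and scores only the first i sequences, and NextVertex() and ByPass() return a named list with the assumed elements `s` and `i`. The toy sequences are hypothetical.

```r
# Partial score: consensus score of the first i sequences only
Score <- function(s, i, DNA, l) {
  idx <- seq_len(i)
  motifs <- substring(as.character(DNA)[idx], s[idx], s[idx] + l - 1)
  alignment <- do.call(rbind, strsplit(motifs, ""))
  sum(apply(alignment, 2, function(column) max(table(column))))
}
NextVertex <- function(s, i, t, k) {
  if (i < t) {
    s[i + 1] <- 1
    return(list(s = s, i = i + 1))
  }
  for (j in t:1) {
    if (s[j] < k) {
      s[j] <- s[j] + 1
      return(list(s = s, i = j))
    }
  }
  list(s = s, i = 0)
}
ByPass <- function(s, i, t, k) {
  for (j in i:1) {
    if (s[j] < k) {
      s[j] <- s[j] + 1
      return(list(s = s, i = j))
    }
  }
  list(s = s, i = 0)
}

# A minimal sketch of BBMotifSearch()
BBMotifSearch <- function(DNA, t, n, l) {
  k <- n - l + 1
  s <- rep(1, t)
  bestScore <- 0
  bestMotif <- s
  i <- 1
  while (i > 0) {
    if (i < t) {
      # Upper bound: each remaining sequence can add at most l to the score
      optimisticScore <- Score(s, i, DNA, l) + (t - i) * l
      if (optimisticScore < bestScore) {
        step <- ByPass(s, i, t, k)       # prune this subtree
      } else {
        step <- NextVertex(s, i, t, k)
      }
    } else {
      score <- Score(s, t, DNA, l)
      if (score > bestScore) {
        bestScore <- score
        bestMotif <- s
      }
      step <- NextVertex(s, i, t, k)
    }
    s <- step$s
    i <- step$i
  }
  bestMotif
}

seqs <- c("TTACGT", "ACGTTT", "GACGTG")
BBMotifSearch(seqs, t = 3, n = 6, l = 4)  # 3 1 2
```

A subtree is skipped only when even a perfect match in all remaining sequences could not beat the best score found so far, so the result has the same score as an exhaustive search.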
Download files from GitHub
Basic Git settings
- Configure the Git editor
git config --global core.editor notepad
- Configure your name and email address
git config --global user.name "Zuzana Nova"
git config --global user.email z.nova@vut.cz
- Check current settings
git config --global --list
- Create a fork on your GitHub account. On the GitHub page of this repository, find the Fork button in the upper right corner.
- Clone the forked repository from your GitHub page to your computer:
git clone <fork repository address>
- In the local repository, set a new remote for the project repository:
git remote add upstream https://github.com/mpa-prg/exercise_07.git
Create a new commit and send new changes to your remote repository.
- Add file to a new commit.
git add <file_name>
- Create a new commit, enter commit message, save the file and close it.
git commit
- Send a new commit to your GitHub repository.
git push origin main