## How to design a Turing machine which checks whether a number is prime or not? - turing-machines

### Is WW where W belongs to {a,b}* a context free language?

```Is WW where W belongs to {a,b}* a context free language?
If yes, please provide the PDA for it.
```
```No, it is not
Assume, for the sake of contradiction, that it is; then there is a PDA that accepts it.
According to the pumping lemma (for context-free languages), there is a length p such that every word s in the language with |s|>=p (we will pick one shortly) can be split into substrings u,v,w,x,y such that s=uvwxy and:
|vwx|<=p
|vx|>=1
uv^n wx^n y is in the language for any n>=0
Let's consider the word a^p b^p a^p b^p, and such u,v,w,x,y
Either vwx contains the middle of the word, or it's entirely contained in the first half, or it is entirely contained in the second half.
If it's in the first half, consider the word uv^2 wx^2 y. We have added a total length of no more than p, so the mid-point has "moved" by no more than p/2. The second half therefore now starts with a b, but the word starts with an a, so it's not of the form ww.
A symmetric argument applies if it is entirely in the second half: the first half now ends with an a, but the word ends with a b.
Now let's assume it contains the middle, and consider uwy (using n=0). Since |vwx|<=p, we have removed letters only from the b's and a's in the middle, not from the a's and b's at the edges. We have also removed a positive number of letters, so uwy is of the form a^p b^k a^m b^p where either k<p or m<p. Either way, it's not of the form ww.```
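The case analysis above can be sanity-checked for small p with a brute-force sketch. The helper names `is_ww` and `every_split_breaks` are invented for this illustration; the code tries every split s=uvwxy of a^p b^p a^p b^p with |vwx|<=p and |vx|>=1 and checks that pumping with n=0 or n=2 produces a word that is not of the form ww. It only covers small cases and is not a substitute for the proof.

```python
def is_ww(s):
    """Return True if s is some word w written twice."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and s[:half] == s[half:]


def every_split_breaks(p):
    """Check that every legal split of a^p b^p a^p b^p can be pumped out of the language."""
    s = "a" * p + "b" * p + "a" * p + "b" * p
    n = len(s)
    for i in range(n + 1):                 # v starts at i
        limit = min(i + p, n)              # enforces |vwx| <= p
        for j in range(i, limit + 1):      # v ends / w starts at j
            for k in range(j, limit + 1):  # w ends / x starts at k
                for l in range(k, limit + 1):  # x ends at l
                    u, v, w, x, y = s[:i], s[i:j], s[j:k], s[k:l], s[l:]
                    if not v and not x:    # enforces |vx| >= 1
                        continue
                    # The split is "safe" only if both pumped words stay in the language.
                    if all(is_ww(u + v * m + w + x * m + y) for m in (0, 2)):
                        return False
    return True
```

For example, `every_split_breaks(3)` examines every split of `aaabbbaaabbb` and reports whether each one fails for n=0 or n=2.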

### PushDown Automaton (PDA) for L={a^(n)b^(n)c^(n)|n>=1}

```I am on a fool's errand trying to construct a Pushdown automaton for the non-context-free language L={a^(n)b^(n)c^(n)|n>=1} and thought of two approaches.
First approach:-
I thought that for every 'a' in string I will push 3 'a' into the stack and for every 'b' in the string, I will pop 2 'a' from the stack now for every 'c' in the string I will still have 1 'a' in the stack.
Problem with the First approach:- the language generated becomes something like this L={a^(p)b^(m)c^(n)| p>=1 and could not determine how m and n can be defined}
Second approach:-
We know that L={ a^(n)b^(m)c^(m)d^(n) | n>=0 } is a context-free language and L={ wxw | w∈(a,b)* } is also context-free language.
So, I thought L={ a^(n)b^(m)b^(m)c^(n) | n>=1 and m=floor((n+1)/2) }
Problem with the Second approach:- I don't know if we can calculate floor((n+1)/2) in the PDA without disturbing the elements of the stack.
Please help in determining how m and n can be defined in the first approach and how can I find floor((n+1)/2) in the PDA.
JFLAP files available for both if needed.
```
```As Ami Tavory points out, there is no PDA for this language because this language is not context-free. It is easy to recognize this language if you use a queue instead of a stack, use two stacks, or use a Turing machine (all equivalent).
Queue machine:
Enqueue a's as long as you see a's, until you see a b.
Dequeue a's and enqueue b's as long as you see b's, until you see a c.
Dequeue b's as long as you see c's.
Accept if you end this process with no additional input and an empty queue.
Two-stack PDA:
Use the first stack to make sure a^n b^n by pushing a when you see an a and popping a when you see a b;
Use the second stack to make sure b^n c^n by pushing b when you see a b and popping b when you see a c;
Accept if both stacks are empty at the end of this process.
Turing machine:
Ensure a^n ... c^n by replacing each a with A and erasing a matching c;
Ensure A^n b^n by erasing matching pairs of A and b;
Accept if at the end of this process you have no more A and no more b, i.e., the tape has been completely cleared.
```
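The queue machine described above can be simulated directly. This is a minimal sketch, assuming the input is given as a Python string; the function name `accepts_queue` is made up for the example, and `collections.deque` stands in for the machine's queue.

```python
from collections import deque


def accepts_queue(s):
    """Simulate the queue machine for {a^n b^n c^n | n >= 1}."""
    q = deque()
    i, n = 0, len(s)
    # Phase 1: enqueue an 'a' for every 'a' read.
    while i < n and s[i] == "a":
        q.append("a")
        i += 1
    if not q:                      # the language requires n >= 1
        return False
    # Phase 2: for every 'b' read, dequeue an 'a' and enqueue a 'b'.
    while i < n and s[i] == "b":
        if not q or q[0] != "a":   # more b's than a's
            return False
        q.popleft()
        q.append("b")
        i += 1
    # Phase 3: for every 'c' read, dequeue a 'b'.
    while i < n and s[i] == "c":
        if not q or q[0] != "b":   # fewer b's than a's, or more c's than b's
            return False
        q.popleft()
        i += 1
    # Accept only with no input left and an empty queue.
    return i == n and not q
```

A leftover 'a' at the front of the queue in phase 3 means there were fewer b's than a's, which is why a single queue is enough to compare all three counts.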
```One reason you've not managed to construct a pushdown automaton for this language is that there isn't one. The Bar-Hillel pumping lemma shows this.
To outline the proof, suppose it can be done. Then, for some p, each string in the language of length at least p can be partitioned into uvwxy, s.t.,
|vwx| <= p
|vx| >= 1
uv^n wx^n y is also accepted by the automaton, for any n >= 0.
The first rule implies that vwx can't span all three regions, only at most two (for large enough strings). The second and third rules now imply that you can pump so that the un-spanned region is smaller than at least one of the other regions.```

### Letter substitutions termination

```Given:
A character string S of length l containing only characters from 'a' to 'z'
A set of ordered substitution rules R (in the form X->Y) where X, Y are single letters from 'a' to 'z' (e.g., 'a' -> 'e' could be a valid rule but 'ce'->'abc' would never be a valid rule)
When a rule r in R is applied on S, all letters of S which are equal to the left side of r are replaced by the letter on the right side of r. If r causes any replacement in S, r is called a triggered rule.
Flowchart (Algorithm) :
(1) Apply all rules in R in turn (following the order of rules in R) on S.
(2) While (there exists any 'triggered rule' DURING (1) ) : repeat (1)
(3) Terminate
The question is: is there any way to determine whether, for a given string S and rule set R, the algorithm terminates or runs forever?
Example1 : (manually executed)
S = 'abcdef' R = { 'a'->'b' , 'b' -> 'c' }
(the order is implied by the order of appearance from left to right of each rule)
After running the algorithm on S and R:
(1.1): 'abcdef' --> 'bbcdef' --> 'cccdef'
(2.1): repeat (1) because there are 2 replacements during the (1.1)
(1.2): 'cccdef'
(2.2): continue to (3) because there is no replacement during the (1.2)
(3) : terminate the algorithm
=> The algorithm terminates with the given S and R
Example2:
S = 'abcdef' R = { 'a'->'b' , 'b' -> 'a' }
(the order is implied by the order of appearance from left to right of each rule)
After running the algorithm on S and R:
(1.1): 'abcdef' --> 'bbcdef' --> 'abcdef'
(2.1): repeat (1) because there are 2 replacements during the (1.1)
(1.2): 'abcdef' --> 'bbcdef' --> 'abcdef'
(2.2): repeat (1) because there are 2 replacements during the (1.2)
(1.3): ...... that would be alike (1.1) forever....
The step (3) (terminate) is never reached.
=> The algorithm won't terminate with the given S and R.
I worked on this and found no efficient algorithm for the question "does the algorithm halt?".
The first idea that came to my mind was to find a cycle of letters that appear in triggered rules, but the number of rules may be too large for this idea to be ideal.
The second one is to propose a "threshold" for the number of repeats: if the threshold is exceeded, we conclude the algorithm would not terminate.
The "threshold" could be chosen arbitrarily (as long as it is big enough), but this approach is not really compelling.
I am wondering whether there is an upper bound for the "threshold" which ensures that we always get the right answer.
I came up with threshold = 26, where 26 is the number of letters from 'a' to 'z', but I can't prove whether that is true or not.
(I hope that it would be something like the Bellman-Ford algorithm, which detects a negative cycle in a fixed number of steps.)
```
```One simple way to think about solving this is to consider a string of length 1 and see if the problem can loop for any given starting letter. Since the string's length is never changing, and applying a rule applies to each character in S independently, it suffices to consider just a string of length 1.
Now, start with a state diagram with 26 states - 1 for each letter of the alphabet. Now, for your state transitions, consider this process:
Apply the transitions from R one at a time, in order, until you reach the end of R. If no rule changes the letter during this pass, the letter is a terminal state: once you reach it, no rule is triggered and the algorithm terminates. Otherwise, after applying the entire sequence of R, you will end up with some letter (possibly the starting letter again, as with a -> b, b -> a). This will be your new state.
Note that all state transitions are deterministic because we apply the entire sequence of R, not just the individual transitions. If we applied the individual transitions, we might get confused, because we might have a -> b, b->a, a->c. When looking at the individual operations, we might think there are two possible transitions from a (either to b or to c), but really, considering the entire sequence, we see definitively that a transitions to c.
You will be done creating your state diagram after considering the next states of each starting letter. Creating the entire state diagram in this manner requires 26 * |R| operations. If the string S contains any letter from which a loop in the state diagram is reachable (including the letters in the loop itself), then the algorithm fails to halt; otherwise it will halt.
Alternatively, you can simply run the algorithm and check whether it has halted after 26 passes through the entire sequence of R. With only 26 states, any run in which a rule is still triggered on the 26th pass has revisited a letter and will never halt.```
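The single-letter argument above can be sketched as follows. This is an illustrative sketch, not the answerer's code: the names `apply_pass` and `halts` are invented, rules are assumed to be given as an ordered list of `(source, target)` pairs, and a rule such as `'a' -> 'a'` is assumed not to count as a replacement.

```python
def apply_pass(letter, rules):
    """Apply every rule once, in order, to one letter.

    Returns the resulting letter and whether any rule was triggered.
    """
    triggered = False
    for src, dst in rules:
        if letter == src and src != dst:   # assumption: x -> x is not a replacement
            letter = dst
            triggered = True
    return letter, triggered


def halts(s, rules):
    """Decide whether the repeat-until-no-rule-triggers loop terminates on s."""
    for start in set(s):                   # letters evolve independently
        letter, seen = start, set()
        while True:
            nxt, triggered = apply_pass(letter, rules)
            if not triggered:              # terminal state for this letter
                break
            if letter in seen:             # revisited a non-terminal state: loop
                return False
            seen.add(letter)
            letter = nxt
    return True
```

With the two examples from the question, `halts('abcdef', [('a', 'b'), ('b', 'c')])` is `True` and `halts('abcdef', [('a', 'b'), ('b', 'a')])` is `False`. Each letter is followed for at most 27 passes, so the cost is bounded by a constant times |R|.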

### Conflicting Strings programming competition [closed]

```Given a set of strings, say
ap***
ab*le
a****
ab***
the problem is to find, given the number of strings and number of allowable differences, whether or not a set of strings is consistent.
So with the above set, the answer is "Yes", if we allow a single inconsistent string (the first one), but "No" if we allow no inconsistent strings.
What is the best algorithm, and what is the complexity?
Every single solution I come up with either requires looking at every single combination, or is simply wrong. For example, you can't just go through and add strings to a set (defining distinct as "incompatible"), because then **, ab, ad will pass.
The actual problem (from ):
Problem M
In 2417 archaeologists discovered a large collection of 20th century text documents of vital historical importance. Although there were many duplicated documents it was soon evident that, as well as the damage due to time making much of the text illegible, there were also some disagreements between them. However, it was noticed that groups of texts could be made consistent, i.e. consistency between texts could be achieved by leaving out some (small) number of texts. For example, the texts:
ap***
ab*le
app*e
*p**e
(where * denotes an illegible character) can be made consistent by removing just the second text.
Input will consist of a sequence of sets of texts. Each set will begin with a line specifying the
number of texts in the set, and the maximum number of texts which can be removed. This will be
followed by the individual texts, one per line. Each text consists of at least one and no more than
250 characters, either lower case letters or asterisks. All the texts in a set will be the same length
and there will be no more than 10,000 texts in a set. The sequence of sets is terminated by a line
containing two zeros (0 0).
Output for each set consists of a line containing one of the words ‘Yes’ or ‘No’ depending on
whether or not the set can be made consistent by removing at most the specified number of texts.
Sample input
4 1
ap***
ab*le
app*e
*pple
3 1
a
b
c
4 2
fred
ferd
derf
frd*
0 0
Sample output
Yes
No
No
```
```This feels homeworky, so I'm going to leave out a few details.
A trie can handle this pretty nicely. At any index where a given text contains an *, you make that text descend from all other leaves in the trie. Then you walk the trie, looking for any terminal node that matches enough texts.
The trie has at most n * m nodes, so adding another text is O(nm).
There's a complication in building the trie too. You have to add texts in the right order, and you have to check the proper order for each text index. Otherwise, you can end up with a situation where *b is not contained in the terminal node for ab. But doing that doesn't introduce any further algorithmic complexity.
The total time is O(mn^2). Walking the trie once it's built is O(nm), and adding a node is O(nm) for n nodes.
```
```I propose that you represent a set of consistent strings with a string and a count. The string has a letter at a position where any of the strings in the set has a letter, and an asterisk there otherwise. The count is the number of strings in the set. So {ab**, a*b*} = [abb*, 2].
Start off with a single representation, [**,0].
Each time you see a string X:
1) Add [X,1] to the set of representations
2) If it is consistent with any of the representations so far, create a new representation from the string and the representation - increment the count, and if necessary fix some more letters in the string. Add the new representation to the set of representations.
3) If you now have more than one representation with the same string, keep just one, with the count the maximum of those with that string.
4) Remove representations whose count is less than the number of strings seen so far minus the number of strings you are allowed to leave out.
5) - repeat from (1) with the next string
At the end the most plausible answer, if any, is the one with the largest count. Any consistent answer will have been created. The maximum number of representations on hand at any one time is the maximum number of possible answers at that stage, which is Choose(n, x) where n is the number of strings seen at that point and x is the number of texts you are allowed to discard. If x = 2 this is n(n-1)/2. You have to do this n times, and the other costs grow only with the length of the string, so I guess you have an O(mn^3) algorithm.```
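The representation-and-count procedure above can be sketched in a few lines. This is one possible reading of the steps, not the answerer's own code: the names `merge` and `can_be_consistent` are invented, and the representations are kept in a dictionary mapping each representation string to its best count.

```python
def merge(rep, text):
    """Combine a representation with a text; return None if they conflict."""
    out = []
    for r, t in zip(rep, text):
        if r == "*":
            out.append(t)
        elif t == "*" or t == r:
            out.append(r)
        else:
            return None
    return "".join(out)


def can_be_consistent(texts, max_removed):
    """True if removing at most max_removed texts leaves a consistent set."""
    if not texts:
        return True
    reps = {"*" * len(texts[0]): 0}        # start with the all-asterisk representation
    for seen, text in enumerate(texts, start=1):
        new_reps = dict(reps)
        # Step 1: the text on its own is a representation with count 1.
        new_reps[text] = max(new_reps.get(text, 0), 1)
        # Steps 2 and 3: extend every compatible representation, keeping the best count.
        for rep, count in reps.items():
            merged = merge(rep, text)
            if merged is not None:
                new_reps[merged] = max(new_reps.get(merged, 0), count + 1)
        # Step 4: drop representations that have already skipped too many texts.
        reps = {r: c for r, c in new_reps.items() if c >= seen - max_removed}
    return bool(reps)
```

On the sample input from the question, `can_be_consistent(["ap***", "ab*le", "app*e", "*pple"], 1)` is `True`, while the `a`/`b`/`c` set with one removal and the `fred` set with two removals both give `False`.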

### Turing machine diagram for enumerator

```I am supposed to draw an enumerator for the language 0^k1^k (k>=0). I am not sure how that is different from building a Turing machine state diagram for this language. The way I understand it, I need to build an enumerator that recognizes the aforementioned language given all strings over {0,1}, by simulating the Turing machine that recognizes this language on string i for i steps, which I couldn't think how to do using a state diagram. My teacher has pointed out that this is how we prove the equivalence between an enumerator and a Turing machine. So I thought that what we have to do is use the transition function defined for enumerators, which makes the diagram look similar to the Turing machine that recognizes 0^k1^k, only instead of moving to qaccept we move to qprint for inputs in the language, and for inputs that must be rejected we print epsilon? But how do we go about producing an infinite number of strings over the alphabet {0,1}? At the initial state the work tape and the print tape are empty. Can someone clarify these points for me? Maybe I misunderstand.
```
```I think I finally have the enumerator notion clear: an enumerator is not supposed to read an input; it creates the words of the language for which it is built.
Here's the algorithm:
1. print epsilon on the output tape
2. write 01 on the work tape
3. go back to the front of the tape and copy its contents to the output tape
4. go back to the leftmost 0, replace it with 1, go to the rightmost 1 and add two 1's at the end.
5. go back to stage 3
```
```I thought of another slightly different algorithm that produces a smaller number of states, and uses only {0,blank} on its work tape:
```
```I think you might have an error there.
In stage 4, you wrote "go back to the leftmost 0, replace it with 1, go to the rightmost 1 and add two 1's at the end".
I think it should be: "go back to the leftmost 1, replace it with 0, go to the rightmost 1 and add two 1's at the end".```
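The enumerator, with the corrected stage 4 from the reply above, can be mimicked by a Python generator. This is only a sketch of the tape manipulation, with a list standing in for the work tape and `yield` standing in for printing to the output tape; the name `enumerate_0k1k` is invented for the example.

```python
from itertools import islice


def enumerate_0k1k():
    """Yield the words of {0^k 1^k | k >= 0} in order of increasing k."""
    yield ""                     # print epsilon on the output tape
    tape = ["0", "1"]            # write 01 on the work tape
    while True:
        yield "".join(tape)      # copy the work tape to the output tape
        i = tape.index("1")      # go to the leftmost 1 ...
        tape[i] = "0"            # ... and replace it with 0
        tape.extend(["1", "1"])  # add two 1's after the rightmost 1


# The enumerator never halts, so take only a finite prefix of its output.
first_words = list(islice(enumerate_0k1k(), 4))
```

Here `first_words` is `['', '01', '0011', '000111']`. Unlike a recognizer, the generator takes no input; it just keeps producing the next word of the language.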