My Turing machine copy loop does not end

I am trying to write a set of rules that copies a sequence of ticks, however long, to the space just after the original sequence. I have a loop which does this, but it does not stop: it continues past the end and breaks the copied items.
http://i.imgur.com/8cpaYkN.png
Below the cut-off of the picture it should say 5 0 -> 1 Tick.
This is based on the model found at: http://en.wikipedia.org/wiki/Turing_machine_examples#A_copy_subroutine
Any insight?
edit: it should halt when it reaches the middle zero between the two series of numbers, but mine keeps going.
edit:
So my program checks for a 1; if it finds one, it turns it into a 0 and skips the following 1s until it reaches a 0. It skips that 0 and the following 1s (of which there are none at the start) and changes the first 0 to a 1. It then returns by skipping the 1s and the 0, then skips the 1s until it finds the first 0 (the one that was changed), changes it back to a 1, and the program loops. It should stop when it reaches the centre 0 that separates the two numbers.
It is like this.
state 1, 0 -> state 1, 0
state 1, 1 -> state 2, 1 [changes the first 1 to a 0]
state 2, 1 -> state 2, 1
state 2, 0 -> state 3, 0
state 3, 1 -> state 3, 1
state 3, 0 -> state 4, 1 (right) (goes back) [changes the first 0 to a 1]
state 4, 1 -> state 4, 1 (right)
state 4, 0 -> state 5, 0 (right)
state 5, 1 -> state 5, 1 (right)
state 5, 0 -> state 1, 1 (left) (loops) [changes the first changed 1 back]
If I run this on any sequence of ones it will copy them; however, the loop will not stop, and it continues after it has finished and breaks the copy.
So if I input:
0 0 1 1 1 0 0 0 0 0 0.....
the rules will do the following:
0 0 1 1 1 0 0 0 0 0 0.....
0 0 0 1 1 0 1 0 0 0 0.....
0 0 1 0 1 0 1 1 0 0 0.....
0 0 1 1 0 0 1 1 1 0 0.....
0 0 1 1 1 0 1 1 1 0 0.....
(it is now supposed to stop but it keeps going, testing the new copied inputs.)

This line is wrong:
state 1, 1 -> state 2, 1 [changes the first 1 to a 0]
It should be
state 1, 1 -> state 2, 0 [changes the first 1 to a 0]
But I'm a bit puzzled, because I think the effect of this mistake is to keep growing the number of 1s on the left of the tape.
The procedure should work like this:
Starting at the first 1 in the input:
If the current symbol is 0, stop.
Replace it with a zero.
Scan for the middle zero.
Scan over the existing 1s.
Write a 1.
Scan backwards for the second 0.
Replace it with a 1.
Go right once.
Repeat
Here's a working implementation in Python. I think I've got left/right the opposite way to you (I think of left/right as describing the movement of the write head rather than the tape). Otherwise, the only change to your program is the mistake I mentioned at the start of this answer.
def machine(tape, program, start, pc):
    head = start
    while True:
        if head < 0:
            raise AssertionError('off tape')
        if head >= len(tape):
            tape.append(0)  # extend the tape with a blank as needed
        cmd = program[pc, tape[head]]
        if cmd == 'halt':
            return
        newpc, write, move = cmd
        tape[head], head, pc = write, head + move, newpc
        print(pc, head, tape)

L, R = -1, 1
tape = [1, 1, 1, 0]
program = {
    (1, 0): 'halt',
    (1, 1): (2, 0, R),  # erase the current 1 (the fix)
    (2, 1): (2, 1, R),
    (2, 0): (3, 0, R),  # step over the middle zero
    (3, 1): (3, 1, R),
    (3, 0): (4, 1, L),  # write the copied 1
    (4, 1): (4, 1, L),
    (4, 0): (5, 0, L),  # back over the middle zero
    (5, 1): (5, 1, L),
    (5, 0): (1, 1, R),  # restore the erased 1 and loop
}
machine(tape, program, 0, 1)
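Run on tape = [1, 1, 1, 0] as above, the machine prints its trace and (if I've traced it right) halts with the tape reading
[1, 1, 1, 0, 1, 1, 1]
i.e. the three 1s copied into the space to the right of the original block, separated by the middle zero.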

Related

Get back to DataFrame after df.as_matrix()

I play with a dataset in pandas.
At some point I use it as a matrix (df.as_matrix()), then I do some transformations (with sklearn) and I want to go back to a DataFrame.
How can I get from df.as_matrix() back to df in the most straightforward way, preserving the index and column names?
Consider the data frame df
df = pd.DataFrame(1, list('xyz'), list('abc'))
df
a b c
x 1 1 1
y 1 1 1
z 1 1 1
as_matrix gives you:
df.as_matrix()
array([[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
It is completely reasonable to go back to a data frame with
pd.DataFrame(df.as_matrix())
0 1 2
0 1 1 1
1 1 1 1
2 1 1 1
But you lose the index and column information.
If you still have that info lying around:
pd.DataFrame(df.as_matrix(), df.index, df.columns)
a b c
x 1 1 1
y 1 1 1
z 1 1 1
And you are back where you started.
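Note that in newer pandas versions as_matrix() is deprecated (and eventually removed) in favour of .to_numpy(); assuming such a version, the same round trip is:
import pandas as pd

df = pd.DataFrame(1, list('xyz'), list('abc'))
arr = df.to_numpy()  # modern replacement for df.as_matrix()
restored = pd.DataFrame(arr, index=df.index, columns=df.columns)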

How to use lambda in DataFrame.assign to solve the ambiguity of truth value of Series?

I created a dataframe using the code below:
import pandas as pd
from numpy import exp
import random

moves = [(1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (2, 4)]
data = {'moves': list(map(lambda i: moves[i] if divmod(i, len(moves))[0] != 1 else moves[divmod(i, len(moves))[1]],
                          range(2 * len(moves)))),
        'player': list(map(lambda i: 1 if i >= len(moves) else 2,
                           range(2 * len(moves)))),
        'wins': [random.randint(0, 2) for i in range(2 * len(moves))],
        'playout_number': [random.randint(0, 1) for i in range(2 * len(moves))]}
frame = pd.DataFrame(data)
I want to add another column to dataframe using the code below:
total = sum(map(lambda a, b: exp(a/b) if b != 0 else 0, frame['wins'], frame['playout_number']))
frame = frame.assign(weight=lambda a: exp(a.wins/a.playout_number) / total if a.playout_number != 0 else 0)
when I used the code above I got this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I understand the reason for the error, but I don't know how to rewrite the lambda function to deal with it. I want a function that returns exp(frame['wins'] / frame['playout_number']) when frame['playout_number'] is not zero, and returns zero when it is.
Can you guys help me with that?
# Use apply to create the new weight column.
frame['weight'] = frame.apply(lambda x: exp(x['wins'] / x['playout_number']) / total if x['playout_number'] !=0 else 0, axis=1)
frame
Out[117]:
moves player playout_number wins weight
0 (1, 2) 2 0 0 0.000000
1 (1, 3) 2 0 1 0.000000
2 (1, 4) 2 1 0 0.070885
3 (2, 1) 2 1 0 0.070885
4 (2, 3) 2 0 1 0.000000
5 (2, 4) 2 1 2 0.523774
6 (1, 2) 1 1 1 0.192686
7 (1, 3) 1 0 1 0.000000
8 (1, 4) 1 0 0 0.000000
9 (2, 1) 1 1 0 0.070885
10 (2, 3) 1 1 0 0.070885
11 (2, 4) 1 0 2 0.000000
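A vectorized alternative (just a sketch, assuming the same frame and total as above): mask out the zero denominators first, so exp is never evaluated on a division by zero.
import numpy as np

mask = frame['playout_number'] != 0
frame['weight'] = 0.0
frame.loc[mask, 'weight'] = np.exp(frame.loc[mask, 'wins'] / frame.loc[mask, 'playout_number']) / total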

Aggregating a time series in Pandas given a window size

Lets say I have this data
a = pandas.Series([1,2,3,4,5,6,7,8])
a
Out[313]:
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
dtype: int64
I would like to aggregate the data, grouping it n rows at a time and summing each group. So if n=2 the new series would look like {3, 7, 11, 15}.
try this:
In [39]: a.groupby(a.index//2).sum()
Out[39]:
0 3
1 7
2 11
3 15
dtype: int64
In [41]: a.index//2
Out[41]: Int64Index([0, 0, 1, 1, 2, 2, 3, 3], dtype='int64')
n=3
In [42]: n=3
In [43]: a.groupby(a.index//n).sum()
Out[43]:
0 6
1 15
2 15
dtype: int64
In [44]: a.index//n
Out[44]: Int64Index([0, 0, 0, 1, 1, 1, 2, 2], dtype='int64')
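Note that this relies on a having the default integer index; if it has some other index, grouping by position (a sketch) works the same way:
import numpy as np
a.groupby(np.arange(len(a)) // n).sum()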
You can use a pandas rolling sum and get it like the following:
if n is your interval:
sums = list(a.rolling(n).sum()[n-1::n])
# Optional !!!
rem = len(a) % n
if rem != 0:
    sums.append(a[-rem:].sum())
The first line sums the rows correctly when the data divides evenly into groups; otherwise, the optional part also appends the sum of the remaining rows (depending on your preference). For example, in the above case with n=3, you may want either {6, 15, 15} or just {6, 15}. The code above gives the former; skipping the optional part gives you just {6, 15}.
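For instance, with n=2 on the series above (note that rolling sums come back as floats):
n = 2
sums = list(a.rolling(n).sum()[n-1::n])  # [3.0, 7.0, 11.0, 15.0]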

How can I select rows into N groups per value of a certain column?

I have a table in the form
Span Available Time
A 0 0
B 1 0
C 1 0
A 1 1
B 0 1
C 1 1
... ... ...
A 1 N
B 0 N
C 1 N
I want to group this into groups of X Times per Span. So it would look like:
Span Available Time
A 1 0
A 0 1
... ... ...
A 1 X
B 1 0
B 1 1
... ... ...
B 0 X
C 0 0
C 1 1
... ... ...
C 0 X
A 1 X+1
A 0 X+2
... ... ...
A 1 2X
B 1 X+1
B 1 X+2
... ... ...
B 0 2X
... ... ...
... ... ...
A 0 N-X
A 1 N-X+1
... ... ...
A 0 N
B 1 N-X
B 0 N-X+1
... ... ...
B 1 N
C 0 N-X
C 1 N-X+1
... ... ...
C 1 N
Where X is a factor of N.
How can I group the data in this way using SQL or Spark's DataFrame API?
Also, how can I aggregate that table by X rows per span to get, for example, the percentage availability for the span from time 0 to X, X to 2X, etc.?
edit:
For context, each group of X rows represents a day, and the whole data set represents a week. So I want to aggregate the availability per day, per span.
edit:
Also, I know what X is. So I want to be able to say something like GROUP BY Span LIMIT X ORDER BY Time
edit:
As a final attempt to describe this better: I want the first X rows of the first span, then the first X of the next span, then the first X of the last span, followed by the next X of the first span, the next X of the second span, and so on, through to the last rows of each span.
Under the assumption that your time column contains a timestamp, and your input data thus looks something like this example RDD:
val rdd = sc.parallelize(List(
  ("A", 0, "2015-01-02 09:00:00"),
  ("A", 1, "2015-01-02 10:00:00"),
  ("A", 1, "2015-01-02 11:00:00"),
  ("B", 0, "2015-01-02 09:00:00"),
  ("B", 0, "2015-01-02 10:00:00"),
  ("B", 1, "2015-01-02 11:00:00"),
  ("A", 1, "2015-01-03 09:00:00"),
  ("A", 1, "2015-01-03 10:00:00"),
  ("A", 1, "2015-01-03 11:00:00"),
  ("B", 0, "2015-01-03 09:00:00"),
  ("B", 0, "2015-01-03 10:00:00"),
  ("B", 0, "2015-01-03 11:00:00")
))
you could achieve your grouping and aggregation like this:
rdd.map{ case (span, availability, timestamp) =>
    ((span, getDate(timestamp)), (List((availability, timestamp)), availability, 1)) }
  .reduceByKey((v1, v2) => (v1._1 ++ v2._1, v1._2 + v2._2, v1._3 + v2._3))
  .mapValues(v => (v._1, v._2.toDouble / v._3))
(Where getDate() is some function that will return the date from a timestamp.)
This will produce output in the format of (span, List((availability, time)), availability_percentage). For my example rdd the result will look like this:
(B,List((0,2015-01-02 09:00:00), (0,2015-01-02 10:00:00), (1,2015-01-02 11:00:00)),0.3333333333333333)
(A,List((0,2015-01-02 09:00:00), (1,2015-01-02 10:00:00), (1,2015-01-02 11:00:00)),0.6666666666666666)
(A,List((1,2015-01-03 09:00:00), (1,2015-01-03 10:00:00), (1,2015-01-03 11:00:00)),1.0)
(B,List((0,2015-01-03 09:00:00), (0,2015-01-03 10:00:00), (0,2015-01-03 11:00:00)),0.0)
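Since the question also asks about Spark's DataFrame API: assuming the Time column holds integer time steps and X is the known group size (a sketch, with df and X as placeholders), the per-bucket availability percentage could be computed like this in PySpark:
from pyspark.sql import functions as F

# df has columns Span, Available, Time; X is the rows-per-day group size
result = (df
          .withColumn("bucket", (F.col("Time") / X).cast("int"))  # times 0..X-1 -> bucket 0, X..2X-1 -> 1, ...
          .groupBy("Span", "bucket")
          .agg(F.avg("Available").alias("availability"))  # mean of 0/1 flags = fraction available
          .orderBy("bucket", "Span"))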

Pandas dataframe operations

I have the following dataframe,
df = pd.DataFrame({
    'CARD_NO': [0, 1, 2, 2, 1, 111],
    'request_code': [2400, 2200, 2400, 3300, 5500, 6600],
    'merch_id': [1, 2, 1, 3, 3, 5],
    'resp_code': [0, 1, 0, 1, 1, 1]})
Based on this requirement,
inquiries = df[(df.request_code == 2400) & (df.merch_id == 1) & (df.resp_code == 0)]
I need to flag the records in df whose CARD_NO matches a CARD_NO where inquiries is True.
If inquiries returns:
index CARD_NO merch_id request_code resp_code
0 0 1 2400 0
2 2 1 2400 0
Then df should look like so:
index CARD_NO merch_id request_code resp_code flag
0 0 1 2400 0 N
1 1 2 2200 1 N
2 2 1 2400 0 N
3 2 3 3300 1 Y
4 1 3 5500 1 N
5 111 5 6600 1 N
I've tried several merges, but cannot seem to get the result I want.
Any help would be greatly appreciated.
Thank you.
The following should work if I understand your question correctly, which is that you want to set the flag to true only when the CARD_NO is in the filtered group but the row itself is not in the filtered group.
import numpy as np

mask = (df.request_code == 2400) & (df.merch_id == 1) & (df.resp_code == 0)
df['flag'] = np.where(~mask & df.CARD_NO.isin(df.loc[mask, 'CARD_NO']), 'Y', 'N')
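Alternatively, if you instead want Y on the rows that match the filter themselves: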
filtered = (df.request_code == 2400) & (df.merch_id == 1) & (df.resp_code == 0)
df["flag"] = filtered.map(lambda x: "Y" if x else "N")
