The quick answer is that what you indicate as the fastest way to parse your date/time strings into a datetime
-type index, is indeed the fastest way. I timed some of your approaches and some others and this is what I get.
First,getting an example DataFrame
to work with:
import datetime
from pandas import *
start = datetime(2000, 1, 1)
end = datetime(2012, 12, 1)
d = DateRange(start, end, offset=datetools.Hour())
t_df = DataFrame({'field_1': np.array(['OFF', 'ON'])[np.random.random_integers(0, 1, d.size)], 'field_2': np.random.random_integers(0, 1, d.size)}, index=d)
Where:
In [1]: t_df.head()
Out[1]:
field_1 field_2
2000-01-01 00:00:00 ON 1
2000-01-01 01:00:00 OFF 0
2000-01-01 02:00:00 OFF 1
2000-01-01 03:00:00 OFF 1
2000-01-01 04:00:00 ON 1
In [2]: t_df.shape
Out[2]: (113233, 2)
This is an approx. 3.2MB file if you dump it on disk. We now need to drop the DataRange
type of your Index
and make it a list of str
to simulate how you would parse in your data:
t_df.index = t_df.index.map(str)
If you use parse_dates = True
when reading your data into a DataFrame
using read_table
you are looking at 9.5sec mean parse time:
In [3]: import numpy as np
In [4]: import timeit
In [5]: t_df.to_csv('data.tsv', sep='\t', index_label='date_time')
In [6]: t = timeit.Timer("from __main__ import read_table; read_table('data.tsv', sep='\t', index_col=0, parse_dates=True)")
In [7]: np.mean(t.repeat(10, number=1))
Out[7]: 9.5226533889770515
The other strategies rely on parsing your data into a DataFrame
first (negligible parsing time) and then converting your index to an Index
of datetime
objects:
In [8]: t = timeit.Timer("from __main__ import t_df, dateutil; map(dateutil.parser.parse, t_df.index.values)")
In [9]: np.mean(t.repeat(10, number=1))
Out[9]: 7.6590064525604244
In [10]: t = timeit.Timer("from __main__ import t_df, dateutil; t_df.index.map(dateutil.parser.parse)")
In [11]: np.mean(t.repeat(10, number=1))
Out[11]: 7.8106775999069216
In [12]: t = timeit.Timer("from __main__ import t_df, datetime; t_df.index.map(lambda x: datetime.strptime(x, \"%Y-%m-%d %H:%M:%S\"))")
Out[12]: 2.0389052629470825
In [13]: t = timeit.Timer("from __main__ import t_df, np; map(np.datetime_, t_df.index.values)")
In [14]: np.mean(t.repeat(10, number=1))
Out[14]: 3.8656840562820434
In [15]: t = timeit.Timer("from __main__ import t_df, np; map(np.datetime64, t_df.index.values)")
In [16]: np.mean(t.repeat(10, number=1))
Out[16]: 3.9244711160659791
And now for the winner:
In [17]: def f(s):
....: return datetime(int(s[0:4]),
....: int(s[5:7]),
....: int(s[8:10]),
....: int(s[11:13]),
....: int(s[14:16]),
....: int(s[17:19]))
....: t = timeit.Timer("from __main__ import t_df, f; t_df.index.map(f)")
....:
In [18]: np.mean(t.repeat(10, number=1))
Out[18]: 0.33927145004272463
When working with numpy
, pandas
or datetime
-type approaches, there definitely might be more optimizations to think of but it seems to me that staying with CPython's standard libraries and converting each date/time str
into a tupple of int
s and that into a datetime
instance is the fastest way to get what you want.
Best Answer
You should delete old child items
thisParent.ChildItems
one by one manually. Entity Framework doesn't do that for you. It finally cannot decide what you want to do with the old child items - if you want to throw them away or if you want to keep and assign them to other parent entities. You must tell Entity Framework your decision. But one of these two decisions you HAVE to make since the child entities cannot live alone without a reference to any parent in the database (due to the foreign key constraint). That's basically what the exception says.Edit
What I would do if child items could be added, updated and deleted:
Note: This is not tested. It's assuming that the child item collection is of type
ICollection
. (I usually haveIList
and then the code looks a bit different.) I've also stripped away all repository abstractions to keep it simple.I don't know if that is a good solution, but I believe that some kind of hard work along these lines must be done to take care of all kinds of changes in the navigation collection. I would also be happy to see an easier way of doing it.