regex - Manipulating huge CSV files with sed


I have a set of 4 large CSV files that I need to modify. What I need to do is match the expression /(.*),,/ on a line, copy the atom, and then prepend it to every following line until the expression matches again. Then I need to rinse and repeat until the end of the file (there are approximately 25k lines in each file). Finally, I need to go back through and remove the first occurrence of the atom (the line where it was originally matched).

If possible, I would like to use sed for this. I tried to do it in Vim but could not get the regex right. Any help would be greatly appreciated. An example is shown below:

Before:

    0917,,
    ,882-1273,1
    ,95F 9475,1
    ,276-080,1
    ,40 of 0080,1
    ,275-690 A,1
    ,TX-2311,3
    ,TX-3351,4
    ,B-07432,1
    ,B-6901,1
    ,23-753,1
    ,02F 4307,1
    ,5.1 QBK-ND,1
    ,0944-026,1
    ,0944-027,1
    ,0944-004,1
    ,0944-056,1
    ,0944-057,1
    ,0944-082,1
    ,0944-024,1
    ,0944-025,1
    ,0944-102,4
    ,Lor 102,1
    0918,,
    ,CJ1085,1
    ,1352-152,4
    ,DMS3102 A-18-,1
    ,6-32 KEP,7
    ,6-32 X 3/4,4
    ,6-32 X 1/2,4
    ,1251-102,8
    ,Oct-32,4
    ,10-32 SAE,8

After:

    0917,882-1273,1
    0917,95F 9475,1
    0917,276-080,1
    0917,40 of 0080,1
    0917,275-690 A,1
    0917,TX-2311,3
    0917,TX-3351,4
    0917,B-07432,1
    0917,B-6901,1
    0917,23-753,1
    0917,02F 4307,1
    0917,5.1 QBK-ND,1
    0917,0944-026,1
    0917,0944-027,1
    0917,0944-004,1
    0917,0944-056,1
    0917,0944-057,1
    0917,0944-082,1
    0917,0944-024,1
    0917,0944-025,1
    0917,0944-102,4
    0917,Lor 102,1
    0918,CJ1085,1
    0918,1352-152,4
    0918,DMS3102 A-18-,1
    0918,6-32 KEP,7
    0918,6-32 X 3/4,4
    0918,6-32 X 1/2,4
    0918,1251-102,8
    0918,Oct-32,4
    0918,10-32 SAE,8
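For reference, the line-by-line logic described in the question can be sketched in Python with a regular expression. This is only a rough illustration, not a sed solution: the names `HEADER_RE` and `fill_down` are made up for this sketch, and it assumes that header lines have the shape `atom,,` and that every other line starts with an empty first field (a leading comma). The pattern is also slightly stricter than `(.*)` so that a detail line ending in two commas is not mistaken for a header.

```python
import re

# Assumption: a header line is an atom followed by two empty fields, e.g. "0917,,".
HEADER_RE = re.compile(r"^([^,]+),,\s*$")


def fill_down(lines):
    """Yield detail lines with the most recently seen header atom prepended."""
    atom = ""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        match = HEADER_RE.match(line)
        if match:
            # Remember the atom and drop the header line itself.
            atom = match.group(1).strip()
            continue
        # Assumption: detail lines begin with "," so plain concatenation works.
        yield atom + line


sample = ["0917,,", ",882-1273,1", ",95F 9475,1", "0918,,", ",CJ1085,1"]
result = list(fill_down(sample))
print("\n".join(result))
```

Because `fill_down` is a generator, it can also be fed an open file object, so a 25k-line file never has to be held in memory at once.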

Program (Python):

    import csv

    # Header rows look like "0917,,"; detail rows have an empty first field.
    with open("in", newline="") as infile, open("out", "w", newline="") as outfile:
        reader = csv.reader(infile, dialect="excel")
        writer = csv.writer(outfile, dialect="excel")
        current_header = ""
        for inrow in reader:
            if not inrow:
                continue  # skip blank lines
            if len(inrow[0].strip()) != 0:
                # Header row: remember it and do not write it out.
                current_header = inrow[0]
                continue
            # Detail row: put the current header in the first field.
            writer.writerow([current_header] + inrow[1:])

    print("done")

Input:

    0917,,
    ,882-1273,1
    ,95F 9475,1
    ,276-080,1
    ,40 of 0080,1
    ,275-690 A,1
    ,TX-2311,3
    ,TX-3351,4
    ,B-07432,1
    ,B-6901,1
    ,23-753,1
    ,02F 4307,1
    ,5.1 QBK-ND,1
    ,0944-026,1
    ,0944-027,1
    ,0944-004,1
    ,0944-056,1
    ,0944-057,1
    ,0944-082,1
    ,0944-024,1
    ,0944-025,1
    ,0944-102,4
    ,Lor 102,1
    0918,,
    ,CJ1085,1
    ,1352-152,4
    ,DMS3102 A-18-,1
    ,6-32 KEP,7
    ,6-32 X 3/4,4
    ,6-32 X 1/2,4
    ,1251-102,8
    ,Oct-32,4
    ,10-32 SAE,8

Output:

    0917,882-1273,1
    0917,95F 9475,1
    0917,276-080,1
    0917,40 of 0080,1
    0917,275-690 A,1
    0917,TX-2311,3
    0917,TX-3351,4
    0917,B-07432,1
    0917,B-6901,1
    0917,23-753,1
    0917,02F 4307,1
    0917,5.1 QBK-ND,1
    0917,0944-026,1
    0917,0944-027,1
    0917,0944-004,1
    0917,0944-056,1
    0917,0944-057,1
    0917,0944-082,1
    0917,0944-024,1
    0917,0944-025,1
    0917,0944-102,4
    0917,Lor 102,1
    0918,CJ1085,1
    0918,1352-152,4
    0918,DMS3102 A-18-,1
    0918,6-32 KEP,7
    0918,6-32 X 3/4,4
    0918,6-32 X 1/2,4
    0918,1251-102,8
    0918,Oct-32,4
    0918,10-32 SAE,8
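The question mentions four files of roughly 25k lines each. One possible way to reuse the same loop across several files is to wrap it in a function and call it per file. The sketch below is only an illustration under assumptions: the function names, the `*.csv` pattern, and the `_filled` output suffix are all hypothetical and not part of the original program.

```python
import csv
from pathlib import Path


def fill_down_file(src, dst):
    """Stream src to dst, prefixing each detail row with the last header seen."""
    current_header = ""
    rows_written = 0
    with open(src, newline="") as infile, open(dst, "w", newline="") as outfile:
        writer = csv.writer(outfile, lineterminator="\n")
        for row in csv.reader(infile):
            if not row:
                continue  # blank line
            if row[0].strip():
                # Header row: remember its first field, write nothing.
                current_header = row[0].strip()
                continue
            writer.writerow([current_header] + row[1:])
            rows_written += 1
    return rows_written


def fill_down_all(folder, pattern="*.csv"):
    """Process every matching file; return {output file name: rows written}."""
    counts = {}
    # sorted() materialises the file list before any output file is created.
    for src in sorted(Path(folder).glob(pattern)):
        if src.stem.endswith("_filled"):
            continue  # skip outputs left over from an earlier run
        dst = src.with_name(src.stem + "_filled.csv")
        counts[dst.name] = fill_down_file(src, dst)
    return counts
```

Each file is read and written one row at a time, so memory use stays flat regardless of file size, and the originals are left untouched.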

Have fun

