Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.databases.ms-sqlserver > #556

Embedded <CRLF> in text file

From "Kurt Dicus" <kdicus@dicus.com>
Newsgroups comp.databases.ms-sqlserver
Subject Embedded <CRLF> in text file
Date 2011-07-21 00:17 -0700
Organization ETT newsserver
Message-ID <j08ju6$lhe$1@news.ett.com.ua> (permalink)

Show all headers | View raw


It's been years and years since I used a newsgroup.  Wow.  This brings back 
some memories!

Okay, my issue.

I have an output from a system that generates the following (snipped of 
relevant info).  The file is ANSI encoded.

<t> = tab
<c> = CRLF

This would be a perfect file!

ID<t>STATUS<t>DESCRIPTION<t>ACTIVE<c>
1<t>active<t>Test description short<t>true<c>
2<t>under construction<t>Test description 2<t>false<c>

What the file actually looks like.
ID<t>STATUS<t>DESCRIPTION<t>ACTIVE<c>
1<t>active<t>Test<c>   -- it is only the DESCRIPTION field that has the 
premature break
description<c>
long string<t>true<c>
2<t>under construction<t>Test description 2<t>false<c>

There are a limited set of values for the STATUS.  Let's just assume that 
active and under construction are the only two possible values.

Before you ask, no, I don't have any way to change the output from the 
source system -- unfortunately.

What do you think?  Should I just write a parsing script outside of SQL 
before the import that builds a correctly formatted file?

The logic might be something like...

There are four (and only four) different kinds of lines (excluding the field 
names)
(1) number<t>string<t>string<t>string<c>
(2) number<t>string<t>string<c>
(3) string<c>
(4) string<t>string<c>

The logic could be something like:
If (1), write line to output file.
If (2), write line to output file.  Look at next line.  If it is like (3) 
append to previous line and look at next line assuming previous line isn't 
complete yet.  If it is like (4), append to previous line and assume a 
completed line.
If (3), make sure previous line looks like (2) or (3).  If so, append to 
previous line.  If previous line is not like (2) or (3), flag an error.
If (4), make sure previous line is like (2) or (3).  If so, append to 
previous line.  If previous line is not like (2) or (3), flag an error.

Or, I could do the same basic thing within SQL.  What else?  I need the best 
idea.

Thank you to anyone (and everyone) who tries to help.

Kurt




Back to comp.databases.ms-sqlserver | Previous | NextNext in thread | Find similar


Thread

Embedded <CRLF> in text file "Kurt Dicus" <kdicus@dicus.com> - 2011-07-21 00:17 -0700
  Re: Embedded <CRLF> in text file Erland Sommarskog <esquel@sommarskog.se> - 2011-07-21 11:05 +0200
    Re: Embedded <CRLF> in text file "Kurt Dicus" <kdicus@dicus.com> - 2011-07-21 02:14 -0700

csiph-web