Friday, March 30, 2012

Immediate processing of large recordset with VB

I have a table with roughly 165,000,000 records. Within
this table is a subset of roughly 6,000,000 records I
need to retrieve and process sequentially with some VB
code. These records are identified by a numeric code in
1 field. There is an index built on this field. When I
run a simple query, it takes roughly 20 minutes to return
the subset of records.
My question: Is there a way to have SQL return record 1
immediately to the VB program so it can begin
processing? As it stands when I issue the ADO command to
build the recordset I am continuosly hitting timeouts. I
assume that's because the SQL server is building the
entire recordset to return to the client. While I can
conceiveably increase my timeout(s), I'd like the client
to start processing whe the SQL server has pulled the
first record, or block of 1000 or something, rather than
wait for the entire recordset to be assembled.
Thanks for your help.You can use:
SELECT ...
FROM ...
WHERE...
OPTION (FAST n)
Where n will be number of rows in the first batch that will returned
quicker. Note that using this option can slow down overal execution time of
your query, and by a not inconsiderable amount.
Why exactly do you need to process 6 million rows sequentially in VB? There
are many things that can be done quicker in T-SQL than in VB with a set that
large.
Jacco Schalkwijk
SQL Server MVP
"DB" <dave_nospam_blair@.adelphia.net> wrote in message
news:130301c52faf$e7457bc0$a501280a@.phx.gbl...
>I have a table with roughly 165,000,000 records. Within
> this table is a subset of roughly 6,000,000 records I
> need to retrieve and process sequentially with some VB
> code. These records are identified by a numeric code in
> 1 field. There is an index built on this field. When I
> run a simple query, it takes roughly 20 minutes to return
> the subset of records.
> My question: Is there a way to have SQL return record 1
> immediately to the VB program so it can begin
> processing? As it stands when I issue the ADO command to
> build the recordset I am continuosly hitting timeouts. I
> assume that's because the SQL server is building the
> entire recordset to return to the client. While I can
> conceiveably increase my timeout(s), I'd like the client
> to start processing whe the SQL server has pulled the
> first record, or block of 1000 or something, rather than
> wait for the entire recordset to be assembled.
> Thanks for your help.|||See clause OPTION of the select statement. There is an option FAST (xxx) tha
t
can help. If you are using ADO, also recommend to increase recordset propert
y
CacheSize.
Can you explain what are you trying to do?, may be somebody can come with a
set-based solution.
AMB
"DB" wrote:

> I have a table with roughly 165,000,000 records. Within
> this table is a subset of roughly 6,000,000 records I
> need to retrieve and process sequentially with some VB
> code. These records are identified by a numeric code in
> 1 field. There is an index built on this field. When I
> run a simple query, it takes roughly 20 minutes to return
> the subset of records.
> My question: Is there a way to have SQL return record 1
> immediately to the VB program so it can begin
> processing? As it stands when I issue the ADO command to
> build the recordset I am continuosly hitting timeouts. I
> assume that's because the SQL server is building the
> entire recordset to return to the client. While I can
> conceiveably increase my timeout(s), I'd like the client
> to start processing whe the SQL server has pulled the
> first record, or block of 1000 or something, rather than
> wait for the entire recordset to be assembled.
> Thanks for your help.
>|||You can use TOP to retrieve a batch of a predefined number of rows:
SELECT TOP 10000 WITH TIES col1, col2, ...
FROM YourTable
WHERE key_col > @.key_col
/* @.key_col = The primary key of the last value processed */
AND code = 1
ORDER BY key_col
Also consider whether your VB process would be better done in SQL or
located in the middle tier if it isn't already.
David Portas
SQL Server MVP
--|||> I am continuosly hitting timeouts. I
> assume that's because the SQL server is building the
> entire recordset to return to the client.
Yeah, some of the time that you have to wait might involve query processing
time but the rest of the time involves all that data flowing across the
network onto your PC. I assume that the memory usage on the PC that the VB
program is running on goes up as the 6 million rows are returned.
When you "process" a row of data is there any way to determine that the
specific row is processed? If so you could do something like:
SELECT TOP 1000
FROM YourTable
/*pull the rows (based on the numeric code) that you are interested in*/
WHERE SomeColumn = @.TheNumericCode
/*this pulls the rows that have not been processed*/
AND HasThisRowBeenProcessed = 0
/*important to make sure that the rows are processed in the correct order*/
ORDER BY ...
Keith
"DB" <dave_nospam_blair@.adelphia.net> wrote in message
news:130301c52faf$e7457bc0$a501280a@.phx.gbl...
> I have a table with roughly 165,000,000 records. Within
> this table is a subset of roughly 6,000,000 records I
> need to retrieve and process sequentially with some VB
> code. These records are identified by a numeric code in
> 1 field. There is an index built on this field. When I
> run a simple query, it takes roughly 20 minutes to return
> the subset of records.
> My question: Is there a way to have SQL return record 1
> immediately to the VB program so it can begin
> processing? As it stands when I issue the ADO command to
> build the recordset I am continuosly hitting timeouts. I
> assume that's because the SQL server is building the
> entire recordset to return to the client. While I can
> conceiveably increase my timeout(s), I'd like the client
> to start processing whe the SQL server has pulled the
> first record, or block of 1000 or something, rather than
> wait for the entire recordset to be assembled.
> Thanks for your help.|||I wonder is possible to indetify that subset of data? I mean, using an
indentity field.
In any case, carrying up that subset into a temp table and working with that
?
See you,
"Keith Kratochvil" wrote:

> Yeah, some of the time that you have to wait might involve query processin
g
> time but the rest of the time involves all that data flowing across the
> network onto your PC. I assume that the memory usage on the PC that the V
B
> program is running on goes up as the 6 million rows are returned.
> When you "process" a row of data is there any way to determine that the
> specific row is processed? If so you could do something like:
> SELECT TOP 1000
> FROM YourTable
> /*pull the rows (based on the numeric code) that you are interested in*/
> WHERE SomeColumn = @.TheNumericCode
> /*this pulls the rows that have not been processed*/
> AND HasThisRowBeenProcessed = 0
> /*important to make sure that the rows are processed in the correct order*
/
> ORDER BY ...
> --
> Keith
>
> "DB" <dave_nospam_blair@.adelphia.net> wrote in message
> news:130301c52faf$e7457bc0$a501280a@.phx.gbl...
>

No comments:

Post a Comment