Wednesday, March 21, 2012

IMAGE FIELD & Incremental population

Hi guys,
I'm doing some experiments with the Full Text Retrieval functionalities offered
by SQL-Server. I'm going to store documents in IMAGE fields to leverage the indexing capabilities of MS-Search service and there're some point which are not clear to me.
I want to use the incremental indexing (Timestamp-based or Change tracking) but in BOL it's stated that "changes made with WRITETEXT and UPDATETEXT are not detected".
I've a table where FT is enabled on some text fields (varchar(n)) and on an IMAGE field where the document content is stored.
Change Tracking & Update Index in background are enabled on this test table.
If the IMAGE field is loaded when the indexing on the other fields has been already performed, no change is detected and the IMAGE field is not indexed (loading is perfomed using textcopy.exe which use WRITETEXT).
Forcing the insert of all FT fields at the same time (i.e. same transaction), they are all indexed.
Unfortunately there're situations in which the doc. body is updated (e.g. new doc. version). The only way I found out to obtain the re-indexing of IMAGE field is the following: after the IMAGE update, I also update one of the other text field with FT enab
led.
It's a fake update: the field is "updated" with its current content. In this way the record is "marked" as changed and MS-Search indexes the IMAGE field too.
Of couse I can start a full population but this is not an option in a production environment.
I'm wondering which is the clean way to handle this situation.
How can I be sure that all the fields of a record have been indexed ?
Can I signal to MS-Search that a record must be re-indexed ?
Many thanks
Max
I think you have the cleanest solution to this problem.
There is a way of logging every record that is indexed/reindexed, but there
is a slight performance penaly to pay for this and then figuring out which
record is being indexed is also difficult.
Hilary Cotter
Looking for a book on SQL Server replication?
http://www.nwsu.com/0974973602.html
"MadMax" <MadMax@.discussions.microsoft.com> wrote in message
news:0C589D14-E149-4F7D-BE23-779E2E1D9380@.microsoft.com...
> Hi guys,
> I'm doing some experiments with the Full Text Retrieval functionalities
offered
> by SQL-Server. I'm going to store documents in IMAGE fields to leverage
the indexing capabilities of MS-Search service and there're some point which
are not clear to me.
> I want to use the incremental indexing (Timestamp-based or Change
tracking) but in BOL it's stated that "changes made with WRITETEXT and
UPDATETEXT are not detected".
> I've a table where FT is enabled on some text fields (varchar(n)) and on
an IMAGE field where the document content is stored.
> Change Tracking & Update Index in background are enabled on this test
table.
> If the IMAGE field is loaded when the indexing on the other fields has
been already performed, no change is detected and the IMAGE field is not
indexed (loading is perfomed using textcopy.exe which use WRITETEXT).
> Forcing the insert of all FT fields at the same time (i.e. same
transaction), they are all indexed.
> Unfortunately there're situations in which the doc. body is updated (e.g.
new doc. version). The only way I found out to obtain the re-indexing of
IMAGE field is the following: after the IMAGE update, I also update one of
the other text field with FT enabled.
> It's a fake update: the field is "updated" with its current content. In
this way the record is "marked" as changed and MS-Search indexes the IMAGE
field too.
> Of couse I can start a full population but this is not an option in a
production environment.
> I'm wondering which is the clean way to handle this situation.
> How can I be sure that all the fields of a record have been indexed ?
> Can I signal to MS-Search that a record must be re-indexed ?
> Many thanks
> Max
>

No comments:

Post a Comment