Search Syntax (Headline Control)
Windows - Visual C++


Description
Once you understand the search syntax required to create effective search expressions, you can incorporate this syntax into NewsObjects to capture only the news relevant to your application.

You can use this search syntax in three Headline Properties: SearchCriteria, DateRange and Flags. In addition to these Headline properties, the Utility Control contains its own methods and properties that allow you to construct a valid search criteria string. Valid search criteria strings can be obtained from the following Utility properties: SearchMulti, SearchString, SearchWord, SubjectSearchString and TickerSearchString.

What is a Search String?
A search string contains one or more tokens. A token is a word, phrase, subject code, ticker code, feed name, profile short name, or date/time range. Each token contains a delimiter character that specifies the token's type (word, phrase, etc.).

For example, the following search string searches for stories about sports (using the subject code SPRT). The "\" is a delimiter indicating that the string is a subject code:

SEARCH = (SPRT\)
Your application can automatically generate the necessary delimiters or it may require the user to enter them.
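
As a minimal sketch of generating the delimiters automatically, the helpers below append the delimiter for each token type listed under Token Delimiters. The function names are hypothetical stand-ins and are not part of NewsObjects; the Utility Control properties mentioned above may already cover the same ground.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers (not part of NewsObjects) that add the delimiter
// for each token type.

// Subject code: CODE\ or CODE\XXXX, where XXXX is an optional source identifier.
std::string subjectToken(const std::string& code, const std::string& source = "") {
    assert(!code.empty());
    return code + "\\" + source;
}

// Ticker code: SYMBOL\c, optionally followed by a feed code.
std::string tickerToken(const std::string& symbol, const std::string& feed = "") {
    assert(!symbol.empty());
    return symbol + "\\c" + feed;
}

// Profile short name: NAME\p
std::string profileToken(const std::string& shortName) {
    return shortName + "\\p";
}

// Feed: NAME\f
std::string feedToken(const std::string& feed) {
    return feed + "\\f";
}

// Word or phrase: enclosed in quotation marks.
std::string textToken(const std::string& wordOrPhrase) {
    return "\"" + wordOrPhrase + "\"";
}

// Wraps an expression in the SEARCH keyword form.
std::string searchExpression(const std::string& exp) {
    return "SEARCH = (" + exp + ")";
}
```

Under these assumptions, searchExpression(subjectToken("SPRT")) yields the sports example shown above, and tickerToken("AAPL", "RT") yields a ticker token restricted to one feed.
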
You can combine the tokens in a search string with boolean AND, OR, and NOT operators. Each string can also contain one or more keyword expressions (described under Expression Types), which specify criteria for searches, such as the dates over which to search for stories.

Token Delimiters

The following table lists the currently available delimiters and their meanings:

Token              Delimiter          Meaning/Use

Subject Code       \ or \XXXX         XXXX is an optional source identifier, up to four
                                      characters long.
                                      Note: All currently defined source delimiters are
                                      two characters long.
                                          MRG\
                                          or
                                          MRG\XXXX
                                      For example, you could specify the following string
                                      to indicate a Reuters-native subject code for sports:
                                          SPO\rt
                                      Where SPO is the code defined by Reuters for sports
                                      stories and \rt is a delimiter indicating that the
                                      code is from Reuters.
                                      (Note: \c and \f are already defined as delimiters,
                                      as described below. They cannot be used as subject
                                      code delimiters.)

Ticker Code        \c                 Specifies a company's ticker symbol:
                                          IBM\c
                                      You can follow this delimiter with the code for the
                                      feed from which you want the information. For example:
                                          AAPL\cRT
                                      collects stories about Apple Computer from Reuters.

Profile Code       \p                 Specifies a profile short name:
                                          AB\p

Feed               \f                 Specifies a feed:
                                          RT\f

Word or phrase     Quotation Marks    Specifies a word or phrase:
                   (" ")                  "WORD"
                                          "WORDbWORDbWORD"
                                      (Note: b indicates a blank space character)

Boolean Operator   AND, OR, NOT       Specifies a boolean relationship:
                                          "WORD" IBM\c AND

Reverse Polish Notation
A search string must contain one or more tokens, which may be followed by boolean operators. The string must follow the rules of Reverse Polish Notation (RPN).

Below are two equivalent ways of expressing the following infix string using Reverse Polish Notation:

Infix String
(SUB1 OR SUB2 OR WORD) AND (SUB4 OR SUB5) AND NOT SUB6

RPN Ex. 1
SUB1\ SUB2\ OR "WORD" OR SUB4\ SUB5\ OR AND SUB6\ NOT AND

RPN Ex. 2
SUB1\ SUB2\ "WORD" OR OR SUB4\ SUB5\ OR SUB6\ NOT AND AND
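
One way to produce RPN strings like these is a post-order walk over an expression tree: emit both operands, then the operator. The sketch below assumes a hypothetical Expr type (not part of NewsObjects) and treats NOT AND as a single binary operator, as the grammar under Expanded Search Syntax does.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression tree; not part of NewsObjects.
struct Expr {
    std::string token;            // leaf: an already delimited token
    std::string op;               // inner node: "AND", "OR", or "NOT AND"
    std::shared_ptr<Expr> left;
    std::shared_ptr<Expr> right;
};

using ExprPtr = std::shared_ptr<Expr>;

// Creates a leaf holding one token, for example a subject code with its delimiter.
ExprPtr leaf(const std::string& token) {
    auto e = std::make_shared<Expr>();
    e->token = token;
    return e;
}

// Creates an inner node that joins two sub-expressions with a binary operator.
ExprPtr combine(const std::string& op, ExprPtr left, ExprPtr right) {
    assert(op == "AND" || op == "OR" || op == "NOT AND");
    auto e = std::make_shared<Expr>();
    e->op = op;
    e->left = std::move(left);
    e->right = std::move(right);
    return e;
}

// Post-order walk: left operand, right operand, then the operator.
std::string toRpn(const ExprPtr& e) {
    if (e->op.empty()) return e->token;
    return toRpn(e->left) + " " + toRpn(e->right) + " " + e->op;
}
```

Grouping the OR terms left to right and applying NOT AND last reproduces the shape of RPN Ex. 1; grouping them differently gives an equivalent string such as RPN Ex. 2.
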

Expression Types
The search syntax supports four distinct expression types, each indicated by a keyword. The different keywords correspond to the different Headline Properties as indicated:

Keyword      Property

PROFILE      Not currently supported
SEARCH       SearchCriteria
DATERANGE    DateRange
FLAGS        Flags

Other keywords may be added in the future.

The following are examples of each expression type:

1. An expression for a Profile that has been registered on the server:

PROFILE = (CS\p), where CS is the "short name" of a registered profile.

This expression finds stories that match the profile named "CS."

Important! If you want to search for stories that match a single profile, it is more efficient to use a "PROFILE=" expression, as in this example, than to use a "SEARCH=" expression. However, you must use a "SEARCH=" expression if you want to search on a combination of profiles.

2. Boolean RPN expressions for combining Profiles that have been registered on the NewsEdge server.

You must use a "SEARCH =" expression to search on a combination of profiles:

SEARCH = (CS\p MA\p AND BN\p NOT AND), where CS, MA, and BN are the "short names" of registered profiles.

This expression finds stories that match the profiles named "CS" and "MA", but do not match the profile named "BN."

Note: If you want to search on a single profile, it is much more efficient to use a "PROFILE=" expression than a "SEARCH=" expression.

3. Boolean RPN expressions for one-time Searches:

SEARCH = (SUB1\ SUB2\ OR), where SUB1 and SUB2 are industry/subject codes.

This expression finds stories that match at least one of the two specified subject codes.

4. Date/Time ranges, which may include date/time-specific keywords:

DATERANGE = (12/27/1996 12/30/1996 BETWEEN)

This expression finds stories that were received on the NewsEdge server between December 27 and December 30, 1996.

5. Single character Flags that modify the search expression:

FLAGS = (AR), where A indicates "include priority headlines" and R indicates "sort by rank."

This expression finds stories that the news provider "flagged" as priority headlines, then sorts them by rank. The rank is also supplied by the news provider.
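
The expression types above can be combined into one query string. The following is a minimal sketch with a hypothetical buildQuery helper (not part of NewsObjects); it assumes the ordering and single-space separators shown in the grammar under Expanded Search Syntax.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: joins keyword expressions into one query string,
// SEARCH first, then optional DATERANGE and FLAGS, separated by single spaces.
// Note: the grammar lists SEARCH with DATERANGE only together with FLAGS;
// this sketch does not enforce that combination rule.
std::string buildQuery(const std::string& searchExp,
                       const std::string& dateExp = "",
                       const std::string& flags = "") {
    assert(!searchExp.empty());
    std::string query = "SEARCH = (" + searchExp + ")";
    if (!dateExp.empty()) query += " DATERANGE=(" + dateExp + ")";
    if (!flags.empty())   query += " FLAGS=(" + flags + ")";
    return query;
}
```

For example, buildQuery("SUB1\\ SUB2\\ OR") produces the one-time search from example 3, and passing a date expression and a flag string appends the corresponding keyword expressions.
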

Date/Time Keywords
The table below lists the date/time-specific keywords:

Keyword    Meaning

BETWEEN    Retrieve stories between the two specified date/time values.
BEFORE     Retrieve stories newer than the specified date/time.
AFTER      Retrieve stories older than the specified date/time.

Important! The BEFORE and AFTER keywords refer to the story's location in the list of stories on the server. Picture the story list as a LIFO stack: the first story in is pushed to the bottom of the stack by subsequent stories. When searching down through the stack, you reach the newer stories at the top of the stack BEFORE you reach the older stories at the bottom of the stack. Similarly, you reach the older stories at the bottom of the stack AFTER you reach the newer stories. For example:

Newest Story (listed BEFORE others)    Story H (12/22/96)
                                       Story G (12/22/96)
                                       Story F (12/22/96)
                                       Story E (12/20/96)
                                       Story D (12/20/96)
                                       Story C (12/17/96)
                                       Story B (12/17/96)
Oldest Story (listed AFTER others)     Story A (12/15/96)

Using this example, if you specify

DATERANGE = (12/20/96 BEFORE)

the search returns only Stories F, G, and H. This is because, when searching the story stack, F, G, and H are the only stories that are located BEFORE the stories from 12/20/96 or earlier.

If you specify

DATERANGE = (12/17/96 AFTER)

the search returns only Story A. This is because, in the story stack, Story A is the only story that is located AFTER the stories from 12/17/96.
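
The stack semantics can be checked with a small simulation. This is only an illustrative sketch: the Story type, the YYYYMMDD date encoding, and the function names are assumptions, and the real filtering happens on the NewsEdge server.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical story record; the date is stored as YYYYMMDD for easy comparison.
struct Story {
    std::string name;
    int date;
};

enum class DateKeyword { Before, After };

// Filters a newest-first story list the way the text describes:
// BEFORE keeps stories newer than the date, AFTER keeps stories older than it.
std::vector<std::string> filterStories(const std::vector<Story>& newestFirst,
                                       int date, DateKeyword keyword) {
    // Precondition: the list is ordered like the stack, newest story first.
    for (std::size_t i = 1; i < newestFirst.size(); ++i)
        assert(newestFirst[i - 1].date >= newestFirst[i].date);

    std::vector<std::string> result;
    for (const Story& s : newestFirst) {
        bool keep = (keyword == DateKeyword::Before) ? (s.date > date)
                                                     : (s.date < date);
        if (keep) result.push_back(s.name);
    }
    return result;
}

// The example story stack from the text, newest story first.
std::vector<Story> exampleStack() {
    return {
        {"H", 19961222}, {"G", 19961222}, {"F", 19961222},
        {"E", 19961220}, {"D", 19961220},
        {"C", 19961217}, {"B", 19961217},
        {"A", 19961215},
    };
}
```

With this model, a BEFORE filter on 12/20/96 keeps H, G, and F (in stack order), and an AFTER filter on 12/17/96 keeps only A, matching the results described above.
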

Flags
A flag is a single character indicating how to modify the search criteria. Flags are case-sensitive. Any private extensions you create must be lower case. The following table lists the currently defined flag values:

Flag    Meaning

A       Include stories marked as Priority headlines by the news provider.
F       Include first takes.
S       Include subsequent takes.
X       Search headline text and story text.
R       Sort by rank.
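
A flag string can be checked before it is sent. The sketch below uses hypothetical helper names (not part of NewsObjects) and assumes that any lower-case letter is an acceptable private extension, which is an interpretation of the rule above rather than documented behavior.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns true if every character is a currently defined flag (A, F, S, X, R)
// or a lower-case letter, which the text reserves for private extensions.
bool isValidFlags(const std::string& flags) {
    const std::string defined = "AFSXR";
    if (flags.empty()) return false;
    for (char c : flags) {
        bool isDefined = defined.find(c) != std::string::npos;
        bool isPrivate = std::islower(static_cast<unsigned char>(c)) != 0;
        if (!isDefined && !isPrivate) return false;
    }
    return true;
}

// Wraps a validated flag string in the FLAGS keyword form.
std::string flagsExpression(const std::string& flags) {
    assert(isValidFlags(flags));
    return "FLAGS = (" + flags + ")";
}
```

Because flags are case-sensitive, an upper-case letter outside the defined set (for example Z) is rejected, while a lower-case letter is passed through as a private extension.
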

Expanded Search Syntax
This section fully describes the search syntax. It uses the following conventions:

Convention    Denotes:

{...}         an optional item.
|             a logical OR.
..            the allowable range.
[ ]           one character, part of a set.
other         standard regular expression syntax.

 

Token                   Values

<query> ::=             PROFILE = (<prof_code>) |
                        PROFILE = (<prof_code>)<space>DATERANGE=(<date_expression>) |
                        SEARCH = (<exp>) |
                        SEARCH = (<exp>)<space>FLAGS=(<flags>) |
                        SEARCH = (<exp>)<space>DATERANGE=(<date_expression>)<space>FLAGS=(<flags>) |
                        DATERANGE = (<date_expression>)

<prof_code> ::=         <prof_short_name>

<prof_short_name> ::=   <dis_char><dis_char>

<dis_char> ::=          displayable character

<exp> ::=               <code> |
                        <exp><space><exp><space><bin_op>

<code> ::=              <subject> |
                        <src> |
                        <ticker> |
                        <prof_short_name> |
                        <text>

<subject> ::=           <subject_code>\ | <subject_code>\<sourceID>

<sourceID> ::=          4-character news provider name

<subject_code> ::=      subject code defined by NewsEdge Corporation.

<src> ::=               <dis_char><dis_char>\f

<ticker> ::=            <ticker_code>\c | <ticker_code>\c<sourceID>

<ticker_code> ::=       company's stock ticker symbol

<text> ::=              <word> |
                        <phrase> |
                        <text><space><word> |
                        <text><space><phrase>

<phrase> ::=            "<words>"

<words> ::=             <word> |
                        <word><space><words>

<word> ::=              <char> |
                        <word><char>

<char> ::=              a|b|..|z |
                        A|B|..|Z |
                        0|1|..|9 |
                        &|[|]|"|"|*

<bin_op> ::=            AND |
                        OR |
                        NOT<space>AND

<space> ::=             blank character

<date_expression> ::=   <date_time> |
                        <date_time><space><date_time><space>BETWEEN |
                        <date_time><space>BEFORE |
                        <date_time><space>AFTER

<date_time> ::=         <date> |
                        <date><space><time>

<date> ::=              <month>/<day>/<year>

<month> ::=             01..12

<day> ::=               01..31

<year> ::=              2 or 4 digit number

<time> ::=              <hour>:<minute>

<hour> ::=              0..23

<minute> ::=            0..59

<flags> ::=             <flag> |
                        <flags><flag>

<flag> ::=              A|F|S|X|R

Where:

A - Access priority headlines,
F - Retrieve First takes,
S - Retrieve Subsequent takes,
X - search story teXt for a word or phrase,
R - sort by Rank
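
As an illustration of the date productions in this grammar, the sketch below checks a date against the month/day/year form and builds a BETWEEN expression. The function names are hypothetical, and requiring exactly two digits for month and day is an assumption drawn from the 01..12 and 01..31 ranges.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// True if the string is non-empty and consists only of decimal digits.
static bool allDigits(const std::string& s) {
    if (s.empty()) return false;
    for (char c : s)
        if (!std::isdigit(static_cast<unsigned char>(c))) return false;
    return true;
}

// True if the string is exactly two digits with a value in [lo, hi].
static bool inRange(const std::string& twoDigits, int lo, int hi) {
    if (twoDigits.size() != 2 || !allDigits(twoDigits)) return false;
    int value = std::stoi(twoDigits);
    return value >= lo && value <= hi;
}

// Checks the month/day/year form with a 2- or 4-digit year.
// Days are only range-checked (01..31), not matched against the month.
bool isValidDate(const std::string& date) {
    std::size_t first = date.find('/');
    if (first == std::string::npos) return false;
    std::size_t second = date.find('/', first + 1);
    if (second == std::string::npos) return false;

    std::string month = date.substr(0, first);
    std::string day = date.substr(first + 1, second - first - 1);
    std::string year = date.substr(second + 1);

    bool yearOk = allDigits(year) && (year.size() == 2 || year.size() == 4);
    return inRange(month, 1, 12) && inRange(day, 1, 31) && yearOk;
}

// Builds a date expression of the form "from to BETWEEN".
std::string dateRangeBetween(const std::string& from, const std::string& to) {
    assert(isValidDate(from) && isValidDate(to));
    return from + " " + to + " BETWEEN";
}
```

The time part of a date/time value is left out of this sketch; it could be validated the same way against the hour and minute ranges.
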

See also: DateRange, Run, Flags, SearchCriteria, SearchMulti, SearchString, SearchWord, SubjectSearchString, TickerSearchString
