Just a few days ago, the team in Redmond announced the general availability of Azure Search, along with several other announcements.

For the past few months I have had the opportunity to talk, blog and answer questions about Azure Search while it was still in public preview. Today, however, the service is no longer in preview, which means that Microsoft’s managed search-as-a-service solution now comes with an SLA and with a stable REST API schema and models that change less often. In short: full-text search in a box.

The purpose of Azure Search is to help software developers implement a search system within their applications (whether web, mobile or desktop) without the friction and complexity of writing queries in SQL, JavaScript or anything else, and with all the benefits of an administration-free system.

Not only did the team make the service generally available, they also shipped some great new features with this release. The first is an indexer mechanism, which allows Azure Search to crawl for data in data repositories such as Azure DocumentDB, Azure SQL Database or SQL Server running on Azure VMs. The second is the concept of suggesters (previously in preview in the 2014-10-20-Preview API version; I wrote about suggesters in the Azure Search Client Library update announcement here), which allows users to specify a suggest algorithm when running the suggest operation available in Azure Search.

Additionally, a long-awaited official client library for C# has been made available on NuGet (here), which helps make the overall search system development experience more familiar. This does mean that I will invest less time in further developing Azure Search Client Library. It’s been a great journey and experience, which I will certainly miss.

I did, however, take the liberty of writing a small Getting Started application, with the intent of helping you and any other developer interested in Azure Search get a better, more meaningful understanding of how the SDK works. I posted the result in the MSDN Code Gallery here.

As with most of my Azure Search demos, I took the scenario of a generic catalog of NFL games taking place across the U.S. For each event, the following information is stored: the name, the description, the category and rating, the associated tags, the location (the arena’s name and geolocation) and, last but not least, the event’s unique identifier (key). In C# terms, I’ve created the SearchableEvent class, which maps to an Azure Search index type. The advantage of the Azure Search SDK library is that it offers both synchronous methods and their asynchronous counterparts, and that it works for desktop and web projects as well as for projects targeting mobile applications. Additionally, the SDK is written as a wrapper around the newest API version, which implements new features such as suggesters and indexers.

Index creation

As soon as you start the GettingStarted application, it lists the catalog of indexes, followed by the menu of available options. For ease of use, if no index has been created yet in the specified service, only options 1 and 0 (Create Index and Exit, respectively) are available.

Choosing menu option 1 executes the CreateIndexAsync method, which takes the document schema as a parameter. Because the document schema is used several times over the lifetime of an application that takes advantage of the Azure Search features, it’s advisable as a best practice to expose it through a static method on the document’s corresponding class. In the GettingStarted application, this is the GetSearchableEventFields() static method of the SearchableEvent class.
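
To make the pattern concrete, here is a minimal sketch of what such a static schema method might look like. It is not the code from the GettingStarted application: FieldSpec is a hypothetical stand-in for the SDK’s field type so that the sketch compiles on its own, and the attribute choices (which fields are searchable, filterable or facetable) are my assumptions based on the field list above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the SDK's field type (NOT an SDK class).
public class FieldSpec
{
    public string Name { get; set; }
    public string Type { get; set; }
    public bool IsKey { get; set; }
    public bool IsSearchable { get; set; }
    public bool IsFilterable { get; set; }
    public bool IsFacetable { get; set; }
}

public class SearchableEvent
{
    // One place that describes the index schema, so every caller
    // (index creation, index update, tests) sees the same definition.
    // The attribute choices below are assumptions for illustration.
    public static List<FieldSpec> GetSearchableEventFields()
    {
        return new List<FieldSpec>
        {
            new FieldSpec { Name = "key", Type = "Edm.String", IsKey = true },
            new FieldSpec { Name = "name", Type = "Edm.String", IsSearchable = true },
            new FieldSpec { Name = "description", Type = "Edm.String", IsSearchable = true },
            new FieldSpec { Name = "category", Type = "Edm.String", IsFilterable = true, IsFacetable = true },
            new FieldSpec { Name = "tags", Type = "Collection(Edm.String)", IsSearchable = true, IsFacetable = true },
            new FieldSpec { Name = "rating", Type = "Edm.Int32", IsFilterable = true, IsFacetable = true },
            new FieldSpec { Name = "date", Type = "Edm.DateTimeOffset", IsFilterable = true },
            new FieldSpec { Name = "dateadded", Type = "Edm.DateTimeOffset", IsFilterable = true },
            new FieldSpec { Name = "location", Type = "Edm.String", IsSearchable = true },
            new FieldSpec { Name = "geolocation", Type = "Edm.GeographyPoint", IsFilterable = true }
        };
    }
}
```

With the real SDK, the same method would return the SDK’s own field objects instead of FieldSpec instances; the benefit of the pattern is the same either way.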

Index population

For the purpose of this demo, I’ve created an asynchronous method called AddDocumentsAsync(), which generates some mock data and then adds it to the index using the IndexAsync() method. When you add new documents, the best practice is to add them in batches of 1,000 documents if possible.

        private static async Task AddDocumentsAsync() 
        { 
            try 
            { 
                // HERE is the region where I prepare some random mock data 
                var indexOperations = new List<IndexAction>(); 
 
                for (int i = 0; i < 1000; i++) 
                { 
                    var randStadium = rand.Next(27); 
                    var randTeamHome = rand.Next(30); 
                    var randTeamAway = rand.Next(30); 
                    var eventName = string.Format("{0} vs {1}", teams[randTeamHome].Name, teams[randTeamAway].Name); 
                    var eventTags = new string[4]; 
                    // Take the first two tags of each team: home at 0..1, away at 2..3
                    Array.Copy(teams[randTeamHome].Tags, 0, eventTags, 0, 2); 
                    Array.Copy(teams[randTeamAway].Tags, 0, eventTags, 2, 2); 
 
                    var document = new Document(); 
                    document.Add("category", "sport"); 
                    document.Add("name", eventName); 
                    document.Add("tags", eventTags); 
                    document.Add("dateadded", DateTime.Now); 
                    document.Add("date", DateTime.Now.AddDays(rand.NextDouble() * 100)); 
                    document.Add("description", loremIpsum); 
                    document.Add("location", stadiums[randStadium].Name); 
                    document.Add("geolocation", stadiums[randStadium].GeoLocation); 
                    document.Add("key", Guid.NewGuid().ToString()); 
                    document.Add("rating", rand.Next(10)); 
                    indexOperations.Add(new IndexAction(IndexActionType.Upload, document)); 
                } 
 
                ValidateCurrentIndexName(); 
 
                Console.WriteLine(string.Format("Adding {0} documents into index {1}...", indexOperations.Count, currentIndexName)); 
 
                ValidateIndexClient(); 
                var result = await _indexClient.Documents.IndexAsync(new IndexBatch(indexOperations)); 
 
                if (result != null) 
                    ConsoleUtils.WriteColor(ConsoleColor.Green, "Added {0} documents into index {1} successfully.", indexOperations.Count, currentIndexName); 
            } 
            catch (Exception ex) 
            { 
                ConsoleUtils.WriteColor(ConsoleColor.Red, "{0}", ex.ToString()); 
            } 
        }
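
The 1,000 mock documents above fit in a single batch. For a larger catalog, the list of actions could be split into batches before indexing. The following is a minimal sketch of such a split; BatchHelper and InBatches are hypothetical names of my own and are not part of the SDK.

```csharp
using System;
using System.Collections.Generic;

public static class BatchHelper
{
    // Splits a list into consecutive chunks of at most batchSize items.
    // The last chunk holds whatever remains and may be smaller.
    public static IEnumerable<List<T>> InBatches<T>(IList<T> items, int batchSize = 1000)
    {
        if (items == null) throw new ArgumentNullException("items");
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        for (int start = 0; start < items.Count; start += batchSize)
        {
            int count = Math.Min(batchSize, items.Count - start);
            var batch = new List<T>(count);
            for (int i = 0; i < count; i++)
                batch.Add(items[start + i]);
            yield return batch;
        }
    }
}
```

In AddDocumentsAsync(), the single IndexAsync call would then become a loop over BatchHelper.InBatches(indexOperations), sending one IndexBatch per chunk.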

Document query

The basic operation of any search service is probably the query operation. The GettingStarted demo tries out four different queries:

  • one simple query, which also takes advantage of the ‘field narrowing’ feature and uses the ‘ALL’ text search mode
  • one query which searches using the ‘ANY’ text search mode
  • one query which shows the usage of facets
  • and lastly, a query which also uses scoring profiles

A useful thing about the Azure Search SDK is that its Search function takes a SearchParameters object as a parameter, which holds all the configuration and options for querying the index. In the GettingStarted application, the GetSearchParameters() method is responsible for instantiating the SearchParameters object, which is then passed to the index client’s Search() method. Hence, a single method handles all the different searches and simply calls GetSearchParameters() differently for each one.
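
Since the SDK is a wrapper around the REST API, each of these options ends up as a query-string parameter on the underlying request. The sketch below builds such a request URL by hand to show how search mode, field narrowing, facets and scoring profiles fit together. SearchUrlBuilder is a hypothetical helper of my own, the service and index names are placeholders, and I am assuming the 2015-02-28 API version; a real request would also need the api-key header.

```csharp
using System;
using System.Collections.Generic;

public static class SearchUrlBuilder
{
    // Builds a GET URL for the "search documents" REST operation.
    // searchMode is "any" or "all"; select narrows the returned fields.
    public static string BuildQueryUrl(
        string serviceName,
        string indexName,
        string searchText,
        string searchMode = "any",
        string[] select = null,
        string[] facets = null,
        string scoringProfile = null)
    {
        var query = new List<string>
        {
            "api-version=2015-02-28",
            "search=" + Uri.EscapeDataString(searchText),
            "searchMode=" + searchMode
        };

        if (select != null && select.Length > 0)
            query.Add("$select=" + string.Join(",", select));

        // One facet parameter per faceted field.
        if (facets != null)
            foreach (var facet in facets)
                query.Add("facet=" + Uri.EscapeDataString(facet));

        if (!string.IsNullOrEmpty(scoringProfile))
            query.Add("scoringProfile=" + Uri.EscapeDataString(scoringProfile));

        return string.Format("https://{0}.search.windows.net/indexes/{1}/docs?{2}",
            serviceName, indexName, string.Join("&", query));
    }
}
```

For example, the first query in the list above (field narrowing plus the ‘ALL’ search mode) would correspond to a call such as BuildQueryUrl("myservice", "events", "Seahawks Broncos", "all", new[] { "name", "location" }).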

Index update

One of the best things about Azure Search is its ability to quickly update an index. If the index’s schema changes (without impacting how existing documents are indexed) or some new options must be added (such as suggesters or scoring profiles), a simple call will do; there’s no need to recreate the index from scratch or upload the entire catalog of documents again.

For example, in the GettingStarted application, there are two index update calls which update the index’s scoring profiles. In order to test these calls, two queries taking advantage of these scoring profiles are prepared. In the following example, a scoring profile with two functions (a freshness function and a magnitude function) is added to an existing index.

        private async static Task UpdateIndexAsync() 
        { 
            try 
            { 
                ValidateCurrentIndexName(); 
 
                var function1 = new FreshnessScoringFunction() 
                { 
                    FieldName = "dateadded", 
                    Boost = 200, 
                    Parameters = new FreshnessScoringParameters(new TimeSpan(0, 0, 5, 0)), 
                    Interpolation = ScoringFunctionInterpolation.Logarithmic 
                }; 
 
                var function2 = new MagnitudeScoringFunction() 
                { 
                    Boost = 1000, 
                    Parameters = new MagnitudeScoringParameters(9, 10), 
                    FieldName = "rating", 
                    Interpolation = ScoringFunctionInterpolation.Constant 
                }; 
 
                var scoringProfile1 = new ScoringProfile() 
                { 
                    Name = "default", 
                    FunctionAggregation = ScoringFunctionAggregation.Sum, 
                    Functions = new List<ScoringFunction> { function1, function2 }, 
                }; 
 
                var eventIndex = (await _searchClient.Indexes.GetAsync(currentIndexName)).Index; 
                eventIndex.ScoringProfiles = new List<ScoringProfile> { scoringProfile1 }; 
                var result = await _searchClient.Indexes.CreateOrUpdateAsync(eventIndex); 
                if (result != null 
                    && result.Index != null 
                    && result.StatusCode == HttpStatusCode.OK) 
                    ConsoleUtils.WriteColor(ConsoleColor.Green, "Index {0} updated successfully.", currentIndexName); 
            } 
            catch (Exception ex) 
            { 
                ConsoleUtils.WriteColor(ConsoleColor.Red, "{0}", ex.ToString()); 
            } 
        }

Document lookup

Since you take advantage of field narrowing when listing query results, it makes sense to also have a document lookup operation that returns all the retrievable fields of a specific document, based on its key. Document lookup can be achieved by calling the generic Lookup method.
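
At the REST level, a lookup is a GET request that addresses the document by its key. As a rough sketch of what the SDK call translates to, the hypothetical helper below (LookupUrlBuilder is my own name, the service and index names are placeholders, and the 2015-02-28 API version is assumed) builds that URL; omitting the select argument returns every retrievable field.

```csharp
using System;

public static class LookupUrlBuilder
{
    // Builds a GET URL that fetches one document by key.
    // Passing select limits the fields; leaving it null returns all retrievable fields.
    public static string BuildLookupUrl(
        string serviceName,
        string indexName,
        string key,
        string[] select = null)
    {
        var url = string.Format(
            "https://{0}.search.windows.net/indexes/{1}/docs/{2}?api-version=2015-02-28",
            serviceName, indexName, Uri.EscapeDataString(key));

        if (select != null && select.Length > 0)
            url += "&$select=" + string.Join(",", select);

        return url;
    }
}
```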

Conclusion

Search is commonly one of the top reasons potential customers stop using your application within the first 30 seconds, so it is extremely important to have a fully functional, professional search mechanism in place from day 0. Happily, Azure yet again proves to be the right cloud provider, offering much more than just cloud infrastructure, this time by adding a professional search-service solution to its portfolio.
