[MediaWiki] Generators

Chen edited this page Nov 8, 2017 · 32 revisions

Library references

How to work with IAsyncEnumerable<T>

IAsyncEnumerable<T> and IAsyncEnumerator<T> are introduced in the Ix.Async package as asynchronous counterparts of IEnumerable<T> and IEnumerator<T>. With the Ix.Async package, you can consume these asynchronous sequences in much the same way as you work with ordinary enumerators.

  • You can use all the LINQ extension methods on IAsyncEnumerable<T>.
  • You can use the Rx.NET package to convert IAsyncEnumerable<T> to IObservable<T>, if necessary.
  • For now, you can consume the items of an IAsyncEnumerable<T> sequentially using the expanded for-each pattern (see the ShowAllTemplatesAsync method below for an example). Later, if an asynchronous for-each is introduced in C# 8 (hopefully), you might be able to use it on IAsyncEnumerable<T> directly.

Some caveats when consuming the IAsyncEnumerable<T> obtained from generator classes in Wiki Client Library (WCL):

  • Query continuations are handled automatically by WCL. You just need to keep enumerating; WCL will request more results from the server when necessary.
  • Choose a proper PaginationSize. It determines the maximum number of items fetched from the server in one MediaWiki API request. For example, if you are working with the top 50 items from RecentChangesGenerator, you might set PaginationSize to 50 rather than the default of 10, so that they are all fetched in a single request.
  • A common idiom for fetching a small number of results from the generator:
static async Task ShowRecentChangesAsync()
{
    var generator = new RecentChangesGenerator(myWikiSite)
    {
        // Choose wisely.
        PaginationSize = 50,
        // Configure the generator, e.g. setting filter/sorting criteria
        NamespaceIds = new[] {BuiltInNamespaces.Main, BuiltInNamespaces.File},
        AnonymousFilter = PropertyFilterOption.WithProperty
    };
    // Gets the latest 50 changes made by anonymous users
    // in the main (article) and File: namespaces.
    var items = await generator.EnumItemsAsync().Take(50).ToList();
    foreach (var i in items)
    {
        Console.WriteLine(i.Title);
        // Show revision comments.
        Console.Write("\t");
        Console.WriteLine(i.Comment);
    }
    // Gets the latest 50 pages in the main (article) and File: namespaces
    // that were changed by anonymous users.
    var pages = await generator.EnumPagesAsync(PageQueryOptions.FetchExtract).Take(50).ToList();
    foreach (var i in pages)
    {
        Console.WriteLine(i.Title);
        // Show abstract for each revised page.
        Console.Write("\t");
        Console.WriteLine(i.Extract);
    }
}
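
The effect of PaginationSize on the number of requests can be sketched with a small self-contained simulation. Everything here is hypothetical and not part of WCL: FakePagedSource stands in for a server-backed list, and EnumerateAsync mimics how a generator might fetch the next batch only when the current one is exhausted. It is written with the built-in IAsyncEnumerable<T> and await foreach of C# 8.0 and later, not the Ix.Async interfaces used elsewhere on this page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a server-backed list; not a WCL type.
class FakePagedSource
{
    private readonly int _total;

    // Number of simulated API requests issued so far.
    public int RequestCount { get; private set; }

    public FakePagedSource(int total) { _total = total; }

    // Simulates one API request returning at most `size` items from `offset`.
    private Task<List<int>> FetchBatchAsync(int offset, int size)
    {
        RequestCount++;
        var count = Math.Max(0, Math.Min(size, _total - offset));
        return Task.FromResult(Enumerable.Range(offset, count).ToList());
    }

    // Lazily yields items; a new "request" is made only when the
    // current batch has been fully consumed.
    public async IAsyncEnumerable<int> EnumerateAsync(int paginationSize)
    {
        for (var offset = 0; offset < _total; offset += paginationSize)
        {
            var batch = await FetchBatchAsync(offset, paginationSize);
            foreach (var item in batch)
                yield return item;
        }
    }
}

static class PaginationDemo
{
    // Consumes the first `wanted` items and returns how many
    // simulated requests that cost.
    public static async Task<int> CountRequestsAsync(int wanted, int paginationSize)
    {
        var source = new FakePagedSource(total: 1000);
        var taken = 0;
        await foreach (var item in source.EnumerateAsync(paginationSize))
        {
            if (++taken >= wanted)
                break;
        }
        return source.RequestCount;
    }
}
```

Under these assumptions, await PaginationDemo.CountRequestsAsync(50, 10) reports 5 simulated requests, while CountRequestsAsync(50, 50) reports 1.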

How to consume IWikiList-implementation classes

static async Task SearchAsync()
{
    Console.Write("Enter your search keyword: ");
    var generator = new SearchGenerator(myWikiSite, Console.ReadLine())
    {
        PaginationSize = 22
    };
    // We are only interested in the top 20 items.
    foreach (var item in await generator.EnumItemsAsync().Take(20).ToList())
    {
        Console.WriteLine(item);
        Console.WriteLine("\t{0}", item.Snippet);
    }
}

Most of the WikiPageGenerator-derived classes (including AllPagesGenerator) implement IWikiList<WikiPageStub>, i.e., .EnumItemsAsync() returns a sequence of WikiPageStub. If you are only interested in the titles of the pages, consider using .EnumItemsAsync() instead of .EnumPagesAsync().

Still, some classes implement IWikiList<T> where T is something other than WikiPageStub, including:

  • class RecentChangesGenerator : WikiPageGenerator<RecentChangeItem, WikiPage>, IWikiList<RecentChangeItem>, IWikiPageGenerator<WikiPage>
  • class SearchGenerator : WikiPageGenerator<SearchResultItem, WikiPage>, IWikiList<SearchResultItem>, IWikiPageGenerator<WikiPage>
  • class GeoSearchGenerator : WikiPageGenerator<GeoSearchResultItem, WikiPage>, IWikiList<GeoSearchResultItem>, IWikiPageGenerator<WikiPage>
  • class RevisionsGenerator : WikiPagePropertyGenerator<Revision, WikiPage>, IWikiList<Revision>, IWikiPageGenerator<WikiPage>

How to consume IWikiPageGenerator-implementation classes

static async Task ShowAllTemplatesAsync()
{
    var generator = new AllPagesGenerator(myWikiSite)
    {
        StartTitle = "A",
        NamespaceId = BuiltInNamespaces.Template,
        PaginationSize = 50
    };
    // You can specify EnumPagesAsync(PageQueryOptions.FetchContent),
    // if you are interested in the content of each page
    using (var enumerator = generator.EnumPagesAsync().GetEnumerator())
    {
        int index = 0;
        // Before the advent of "async for" (might be introduced in C# 8),
        // to handle the items in sequence one by one, we need to use
        // the expanded for-each pattern.
        while (await enumerator.MoveNext(CancellationToken.None))
        {
            var page = enumerator.Current;
            Console.WriteLine("{0}: {1}", index, page);
            index++;
            // Prompt user to continue listing, every 50 pages.
            if (index % 50 == 0)
            {
                Console.WriteLine("Esc to exit, any other key for next page.");
                if (Console.ReadKey().Key == ConsoleKey.Escape)
                    break;
            }
        }
    }
}
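
For comparison, here is a minimal sketch of how the expanded for-each pattern relates to the await foreach statement that C# 8.0 eventually shipped. It uses the built-in IAsyncEnumerable<T> (with MoveNextAsync and DisposeAsync), which differs from the Ix.Async interfaces used in the samples above. CountAsync and the class name are hypothetical helpers, not WCL APIs.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

static class ForEachDemo
{
    // Hypothetical asynchronous sequence yielding 0, 1, ..., n - 1.
    private static async IAsyncEnumerable<int> CountAsync(int n)
    {
        for (var i = 0; i < n; i++)
        {
            await Task.Yield(); // simulate asynchronous work
            yield return i;
        }
    }

    // Expanded pattern: get the enumerator, loop on MoveNextAsync,
    // and dispose the enumerator when done.
    public static async Task<int> SumExpandedAsync(int n)
    {
        var sum = 0;
        var enumerator = CountAsync(n).GetAsyncEnumerator();
        try
        {
            while (await enumerator.MoveNextAsync())
                sum += enumerator.Current;
        }
        finally
        {
            await enumerator.DisposeAsync();
        }
        return sum;
    }

    // The same loop written with await foreach.
    public static async Task<int> SumAwaitForeachAsync(int n)
    {
        var sum = 0;
        await foreach (var x in CountAsync(n))
            sum += x;
        return sum;
    }
}
```

Both methods visit the items in the same order, so they return the same sum for the same n. The expanded form is still useful when you need direct control over the enumerator, for example to pause between items as ShowAllTemplatesAsync does.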