Add support for meta tags #1091

Open
wants to merge 6 commits into
base: master

vinistock
Contributor

@vinistock vinistock commented Feb 24, 2024

Motivation

Without being able to add SEO meta tags, such as description or keywords, documentation websites generated by RDoc often don't rank well in search engines. For example, Ruby's official documentation (generated by RDoc) is rarely the first result when searching for "Ruby documentation".

I think it would be really beneficial to allow developers to define meta tags to improve the SEO of their RDoc-generated websites - and Ruby itself can benefit from it.

Implementation

I split the implementation by commit:

  1. Added meta_tags to options
  2. Added meta_tags to the RDoc::Task
  3. Used the meta_tags to create the entries in the HTML
  4. Started using meta_tags for RDoc's own documentation
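
As a rough illustration of step 3, here is a minimal sketch of turning a meta_tags hash into HTML entries. The helper name render_meta_tags and the sample values are hypothetical and are not taken from the PR's diff; the only library call used is ERB::Util.html_escape from the standard library.

```ruby
require "erb"

# Hypothetical helper (not the PR's actual code): turns a
# { name => content } hash into one <meta> tag per entry.
def render_meta_tags(meta_tags)
  meta_tags.map do |name, content|
    # Escape both values so quotes or angle brackets cannot break the markup.
    escaped_name = ERB::Util.html_escape(name.to_s)
    escaped_content = ERB::Util.html_escape(content.to_s)
    %(<meta name="#{escaped_name}" content="#{escaped_content}">)
  end.join("\n")
end

# Sample values are made up for illustration.
puts render_meta_tags(
  "keywords" => "ruby,documentation",
  "description" => "API documentation for an example gem"
)
```

Keeping the input as a plain hash is what allows arbitrary names such as og:title without adding a dedicated option for each one.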

Concerns

My main concern with this implementation is the format of the meta_tags option. I used a hash to provide flexibility in which meta tags the users can define. I believe the flexibility is important, because there are many meta tags and trying to account for all of them would be impractical (e.g.: keywords, description, og:description, og:title and so many others).

However, the current implementation of RDoc only allows for CLI-style options. Every option has to be passed as if it were coming from the command line (e.g.: --something=otherthing). This makes passing a hash to the meta_tags option quite awkward. I'm currently converting it to JSON so that I can parse it back into a hash after OptionParser extracts the options.
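
A minimal sketch of that JSON round trip, assuming a hypothetical --meta-tags flag (the exact option name and wiring in the PR may differ). It only uses OptionParser and JSON from the standard library.

```ruby
require "json"
require "optparse"

# Hypothetical parser: the flag name --meta-tags is an assumption,
# chosen only to illustrate the CLI-style constraint.
def parse_meta_tags(argv)
  options = { meta_tags: {} }

  OptionParser.new do |opts|
    opts.on("--meta-tags=JSON", "Meta tags encoded as a JSON object") do |value|
      # The CLI only carries strings, so decode the JSON back into a hash.
      options[:meta_tags] = JSON.parse(value)
    end
  end.parse(argv)

  options
end

# A task-like caller serializes the hash into a CLI-style argument...
meta_tags = { "description" => "Docs for Foo", "keywords" => "ruby,foo" }
argv = ["--meta-tags=#{JSON.generate(meta_tags)}"]

# ...and the option parser turns it back into a hash.
p parse_meta_tags(argv)[:meta_tags] == meta_tags
```

The cost of this design is visible in the sketch: every caller has to serialize a structured value into a string just so the parser can decode it again.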

I don't really like the approach, but my two arguments for proposing this solution are:

  1. If we're going to support meta tags, I believe flexibility is needed. There's no point in supporting only keywords and description as two hardcoded options
  2. Refactoring RDoc to allow for options that don't conform to the CLI style is a bigger effort

Let me know what you think about this.

Testing this change

  1. Switch to this branch
  2. Run bundle exec rake rerdoc to generate the website
  3. Verify that the _site/index.html contains the following tags
<meta name="keywords" content="...">
<meta name="description" content="...">

@p8
Contributor

p8 commented Feb 26, 2024

I'd like to see improvements to Ruby documentation SEO as well.
One of the reasons APIDock still scores high in results is because it has dedicated semantic pages for each method of an object.
So a page with <title>save (ActiveRecord::Base)</title> and <h1>save</h1> will always score better than a page with multiple methods and a more generic title tag.
This can't be fixed with better metatags (I think Google even ignores them for ranking and looks at the page content instead).

Metatags can still be useful, but I'm not sure defining global metatags is the right approach.
Google recommends unique descriptions for each page.
A user using Google benefits from seeing different descriptions and targeted keywords for Hash, URI and Numeric.
In Sdoc we use the class description for the description and the class methods for the keywords:
https://github.com/rails/sdoc/blob/master/lib/rdoc/generator/template/rails/class.rhtml#L10-L18
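
To make the per-page idea concrete, here is a small sketch that derives the description from a class comment and the keywords from its method names. PageInfo and page_meta_tags are hypothetical stand-ins; this is not Sdoc's template code or RDoc's object model, and the sample data is made up.

```ruby
require "erb"

# Hypothetical stand-in for a documented class.
PageInfo = Struct.new(:name, :description, :method_names)

# Builds page-specific tags: the description comes from the class comment,
# the keywords from the class's method names.
def page_meta_tags(page)
  tags = {
    "description" => page.description,
    "keywords" => page.method_names.join(",")
  }

  tags.map do |name, content|
    %(<meta name="#{name}" content="#{ERB::Util.html_escape(content)}">)
  end.join("\n")
end

# Sample data for illustration only.
hash_page = PageInfo.new("Hash", "Maps unique keys to values.", %w[fetch merge each_pair])
puts page_meta_tags(hash_page)
```

Each page gets tags built from its own content, so Hash, URI and Numeric would no longer share one global description.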

@vinistock
Contributor Author

I like the idea of using declaration information as part of the meta tags. How does Sdoc handle a scenario where there's no single declaration on the top level? For example, this code:

class Foo
end

class Bar
end

Does it consider the first declaration as the one to be used in meta tags? Or does it only support one top level namespace per file?

Also, how can we support meta tags for the main page (often the README)?

@p8
Contributor

p8 commented Feb 28, 2024

> How does Sdoc handle a scenario where there's no single declaration on the top level? For example this code:

Each class has its own page. So both Foo and Bar would have their own pages.

> Also, how can we support meta tags for the main page (often the README)?

Sdoc gets the description from the README I think.
The previous link was a bit outdated I think. This is what it uses on main:
https://github.com/rails/sdoc/blob/e9bb867eba81f48c402a129e688e810ec1fa387c/lib/rdoc/generator/template/rails/class.rhtml#L10-L18

@vinistock vinistock force-pushed the vs/add_support_for_meta_tags branch from eace592 to 07c4313 June 8, 2024 22:47
@vinistock
Contributor Author

In the latest commit, I changed the implementation to follow the approach suggested by @p8. I agree that page-specific meta tags will definitely be better than a global definition. And it solves the issue of the weird JSON arguments.

I studied the approach SDoc is using a bit. It's using Nokogiri to find the first paragraph of text in any given page and then it uses that for the description.

I assumed that we cannot have a dependency on Nokogiri in RDoc, so I took a much simpler approach. The idea is to extract an excerpt from the class/module comment or the page's content, ending at the second period, the first period, or 150 characters at most (whichever matches first).
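
A minimal sketch of one possible reading of that rule: prefer two sentences, fall back to one, and finally cut at 150 characters. The excerpt helper and the MAX_LENGTH constant are hypothetical names, and the commit's actual logic may order the checks differently.

```ruby
MAX_LENGTH = 150 # character cap taken from the description above

# Hypothetical helper, not the commit's code.
def excerpt(text)
  # Collapse newlines and repeated spaces so the excerpt fits one attribute.
  flat = text.gsub(/\s+/, " ").strip

  # Everything up to and including the second period, if it fits.
  two_sentences = flat[/\A[^.]*\.[^.]*\./]
  return two_sentences if two_sentences && two_sentences.length <= MAX_LENGTH

  # Otherwise everything up to and including the first period, if it fits.
  one_sentence = flat[/\A[^.]*\./]
  return one_sentence if one_sentence && one_sentence.length <= MAX_LENGTH

  # Last resort: a hard cut at the character cap.
  flat[0, MAX_LENGTH]
end

puts excerpt("Parses options. Validates them. Then runs the generator.")
puts excerpt("Call foo.bar to start. Then stop. Done.")
```

The second call stops after "to start." because the dot in foo.bar is counted as a period, which is exactly the kind of input that throws this heuristic off.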

It's definitely not perfect and it will be thrown off by other unrelated dot characters like the ones present in URLs or inside examples of method invocations in documented code. If you have any other ideas, please let me know.

@nobu I'd love to hear your thoughts on the approach.

@st0012 st0012 self-requested a review August 23, 2024 21:26