Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add transformer to remove types if they existed under mappings #437

Closed
wants to merge 12 commits into from

Conversation

okhasawn
Copy link
Collaborator

Description

This PR adds a transformer that would remove "types" from a request if it existed under "mappings", otherwise it would keep the index's "properties", which used to be mistakenly removed.

Issues Resolved

#367

Check List

  • New functionality includes testing
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
@okhasawn okhasawn changed the title Add transformer to remove types if under mappings Add transformer to remove types if they existed under mappings Nov 16, 2023
Copy link

codecov bot commented Nov 16, 2023

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (b41128e) 77.04% compared to head (d4bafa7) 72.64%.
Report is 1 commits behind head on main.

Files Patch % Lines
...g/opensearch/migrations/transform/Es2OsPost23.java 83.33% 0 Missing and 2 partials ⚠️
...search/migrations/replay/TransformationLoader.java 33.33% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #437      +/-   ##
============================================
- Coverage     77.04%   72.64%   -4.41%     
+ Complexity     1109     1100       -9     
============================================
  Files           141      116      -25     
  Lines          5769     4628    -1141     
  Branches        536      417     -119     
============================================
- Hits           4445     3362    -1083     
+ Misses         1047     1006      -41     
+ Partials        277      260      -17     
Flag Coverage Δ
unittests 72.64% <76.47%> (-0.79%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
Comment on lines 8 to 9
implementation project(':replayerPlugins:jsonMessageTransformers:jsonJMESPathMessageTransformerProvider')
implementation group: 'io.burt', name: 'jmespath-core', version: '0.6.0'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't need these two. Providers are basically just external entrypoints. You're providing your own. The jmespath library (or whatever implementation that the jsonJMESPathMessageTransformer package uses) will be picked automatically.
(see the next comment for how to shake the compile error that you would get)

public class Es2OsPost23 extends JsonJMESPathTransformerProvider {
public JsonJMESPathTransformer createTransformer(Object jsonConfig) {
String script = "{ \"settings\": settings, \"mappings\": payload.mappings.*.properties | [merge(@)] }";
return new JsonJMESPathTransformer(new JcfRuntime(), script);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - now I see why you had to add the dependency... I setup the c'tor to take the dependency because we could be using JSON nodes from jackson or something else - and those themself will require additional dependencies.

Anyway, please add an override constructor that defaults to the JcfRuntime & use that instead...

    public JsonJMESPathTransformer(String script) {
        this(new JcfRuntime(), script);
    }

@@ -0,0 +1 @@
org.opensearch.migrations.transform.Es2OsPost23
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the casing or the two different numbers. Since the 23 is used to represent a sequence number, we should probably not ALSO load a number as an abbreviation. That leaves something like EsToOsPost23. Then we have the casing. I didn't want to use underscores before, but they seem like the least bad option.... What do you think of ES_OS_Post23? We'll likely have lots of occurrences of ES/OS (& other things too), so coming up w/ abbreviations and establishing underscores could allow this to scale.

Comment on lines 3 to 5
public class Es2OsPost23Test {

}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the excision test is still in place & I presume that it is passing. Could you either delete this file or move the excision test into here?

Also please write a test that shows that your new transform fixes the bug that was reported. You can take a look at the normalizeJson functions in this PR and do that here with hardcoded input/output. (IntelliJ will escape as necessary if you paste INTO a string)

Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
Copy link
Collaborator

@gregschohn gregschohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be another test case? Looking at #367, the reporter is migrating TO OS 2.8. Do we need to alter the original transformation or did we just need to disable it since we were running it on requests that didn't have type mappings? Should we be checking that we're removing the type mappings value and not just any value? Before approving, I just want to make sure that I understand how we're fixing the bug. Then you can add comments to it and mark it as fixed.

I'm still not sure about the name of the transform, but we're limited by using the classname since that has strong restrictions on valid characters. We'll also need to include "Provider" on them, OR choose to have less descriptively named providers.

I think there's another task to clean that up.

@okhasawn
Copy link
Collaborator Author

Should there be another test case? Looking at #367, the reporter is migrating TO OS 2.8. Do we need to alter the original transformation or did we just need to disable it since we were running it on requests that didn't have type mappings? Should we be checking that we're removing the type mappings value and not just any value? Before approving, I just want to make sure that I understand how we're fixing the bug. Then you can add comments to it and mark it as fixed.

I'm still not sure about the name of the transform, but we're limited by using the classname since that has strong restrictions on valid characters. We'll also need to include "Provider" on them, OR choose to have less descriptively named providers.

I think there's another task to clean that up.

I'll add another test case, I also think that the name of the transform should be changed, since what it essentially does is it removes type mappings, if and only if they existed, but keeps properties, therefore it doesn't remove any value, and would handle both cases.

I would still say that disabling it on requests that don't have type mappings can still be considered a solution to the bug.
If that can be detected and run transformations based on certain formats we want, then that can reduce the need for complicated transformations.

@@ -65,4 +65,4 @@ private static void transformAndVerifyResult(Map<String,Object> json, String exp
static IJsonTransformer getJsonTransformer() {
return new JsonTypeMappingTransformer();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be testing the new transformer & you should be deleting this one, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a new one that uses JMES transformers.

Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
Copy link
Collaborator

@gregschohn gregschohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to figure out a naming pattern, after some deliberation, I like SOURCE-TARGET where both source and target are structured as STORE_TYPE[/[..]VERSION[..]] where .. on the left means up to (& including). The trailing .. means after. Leaving the version off altogether implies all versions.

Examples would be

ES/..6.8-OS/2.3..
ES/..6.8-OS/..2.3
ES/..6.8-OS/2.4..
Solr-OS

public class Es2OsPost23 implements IJsonTransformerProvider {

@Override
public JsonJMESPathTransformer createTransformer(Object jsonConfig) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return an IJsonTransformer rather than a type of what should be in a transitive dependency. If you change this to be a composite transformer, your callers should not care, so don't leak implementation details here.

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class Es2OsPost23 implements IJsonTransformerProvider {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please affix this with TransformerProvider and override getName() to get the value that you want.

public JsonJMESPathTransformer createTransformer(Object jsonConfig) {
String script = "{ \"settings\": settings, \"mappings\": payload.mappings.*.properties | [merge(@)] }";
ObjectMapper objectMapper = new ObjectMapper();
Map<String, Object> jsonMap = objectMapper.convertValue(jsonConfig, Map.class);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take a look at the provider documentation for createTransformer...

     * @param jsonConfig is a List, Map, String, or null that should be used to configure the
     *                   IJsonTransformer that is being created

What are you trying to do with convert? If you have a map being passed in, this will work. Anything else will cause it to throw. You'd be better off to simply cast this to Map<String,Obiect>.

Comment on lines 12 to 16
ObjectMapper objectMapper = new ObjectMapper();
Map<String, Object> jsonMap = objectMapper.convertValue(jsonConfig, Map.class);
String script;

if (jsonMap.containsKey("mappings") && jsonMap.get("mappings") instanceof Map) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the config value coming in is meant to configure a transformer (like how you're passing the script to the JMESPath Transformer's c'tor). TransformationLoader only creates ONE transformer from a provider AND it passes the configuration, NOT a document, to createTransformer().

You'll want to make sure that you add another test that uses TransformationLoader to configure this transformer and then give it a couple documents. Those tests would have failed & you could have spotted the inconsistencies.

Map<String, Object> mappings = (Map<String, Object>) jsonMap.get("mappings");
if (mappings.containsKey("properties")) {
// No type - no transformation needed for "mappings" since "properties" is directly under it.
script = "{\"settings\": settings, \"mappings\": mappings}";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you cleanup the issues above, you can still have a Transformer that contains a couple JMESPath transformers within it as fields and switch which one you'd like to use within transformJson.

@@ -53,6 +65,15 @@ private static Map<String,Object> parseJsonFromResourceName(String resourceName)
}
}

private static void transformJMESAndVerifyResult(Map<String,Object> json, String expectedValueSource) throws Exception{
var jmesTransformer = getJMESTransformer(json);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests much be updated - they should send a config value (which should be null, since the user shouldn't have anything to configure) - NOT the document.

Signed-off-by: Omar Khasawneh <okhasawn@amazon.com>
@zelinh zelinh deleted the branch opensearch-project:main April 16, 2024 20:17
@zelinh zelinh closed this Apr 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants