Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bound functions leak global scope into strict functions #442

Open
szegedi opened this issue May 20, 2018 · 11 comments · May be fixed by #443
Open

Bound functions leak global scope into strict functions #442

szegedi opened this issue May 20, 2018 · 11 comments · May be fixed by #443
Assignees
Labels
block-scope Ecma Incompatibility Issues about Rhino being incompatible with the EcmaScript spec Strict Mode Issues related to non-compliance with the ES5 Strict Mode spec

Comments

@szegedi
Copy link
Contributor

szegedi commented May 20, 2018

While investigating #238 I noticed that our BoundFunction is replacing null "this" with global too early; it shouldn't be doing that at all (see #238 for relevant ES5 specification citations).

This expression needs to evaluate to true as strict functions aren't allowed to see global scope as their "this" (you can verify with either Nashorn or V8):

((function() { "use strict"; return this == null }).bind(null))()

Unfortunately, in Rhino it evaluates to false.

@szegedi szegedi self-assigned this May 20, 2018
@gbrail
Copy link
Collaborator

gbrail commented May 22, 2018

This looks like an important fix.

We have integration with test262 in the test suite now via a git submodule -- could you take a quick look and see if there are any tests in there that we could enable (by editing testsrc/test262.properties)? It'd be nice to have an independent test to verify this behavior.

@p-bakker
Copy link
Collaborator

This is an important fix and the issue isn't (just) tied to BoundFunction imho.

Been looking on whether I can fix this issue, and this is where things stand: in interpreted mode, I've got something that improves the implementation.

I changed the Token.THIS case in Intepreter.interpretLoop to the following:

                            case Token.THIS:
                                // TODO if topLevel scope is set as this through .apply/call/bind, then it should still be returned (thus not overwritten to Undefined)
                                stack[++stackTop] = ( // Conditionally returning a this value, to comply with strict mode
                                        // CHECKME shouldn't frame.isStrictTopFrame() also already take into account the current frame and thus take care of "frame.idata.isStrict"?
                                        (frame.isStrictTopFrame() || frame.idata.isStrict) && // if strict mode
                                        frame.scope.getParentScope() != null && // and not accessing 'this' in a topLevel scope directly
                                        (frame.thisObj == null || frame.thisObj.getParentScope() == null)) // and thisObj is either null or a topLevel scope
                                            ? Undefined.SCRIPTABLE_UNDEFINED
                                            : frame.thisObj;
                                continue Loop;

What this does is:

  • make a whole wack of tests from Test262 pass
  • breaks the tests for function.prototype.apply/call/bind where the provided this value is a topLevel scope
  • breaks a bunch of tests because there's code that tries to call some Scriptable methods (like .has and .getParentScope) which Undefined.SCRIPTABLE_UNDEFINED doesn't implement.

I'm wondering what is thought off this approach?

I have also tried the approach to intervene at the moment when the topLevel scope is fetched to send it into a .call(....) as the thisObj, but the problem I faced here is that I wasn't able to determine whether the .call() was occurring in strict mode:

  • the Function/Callable instance doesn't expose whether or not is must run in strict mode because its body starts with 'use strict'
  • on the CurrentContext, the flags/functions indicating strict true/false didn't return true for a function declaring 'use strict' in the body, while the 'outside' runs non-strict

Other that that, there are a lot of different calls to Callable/function.call() throughout the code (and potentially to custom implementations of those interfaces outside the Rhino codebase), so wasn't sure whether going this route could handle things properly in all scenario's.

If the basic gist of this approach seems feasible, then any thoughts on how to tackle the issues in the breaking tests?

For the breaking function.prototype.apply/call/bind where the providedthis value is a topLevel scope tests I was thinking of wrapping the this value set through those methods in a custom class and then in the Interpreter checking for that class and unwrapping the value before returning

For the tests breaking because the returned Undefined.SCRIPTABLE_UNDEFINED isn't a fully implemented Scriptable interface I could add the missing bits, but not sure of the impact of that

And then there's the non-interpreted implementation. Haven't wrapped my head around how the whole byteCode generation works, but I did find https://github.com/mozilla/rhino/compare/strict-bind-null which has some, so I might be able to take inspiration from there :-)

@rbri
Copy link
Collaborator

rbri commented May 25, 2021

Have also spend some time on this over the last weekend but only with frustrating results. Will focus on #909 for now, maybe this helps also to fix some strict-mode problems.

@p-bakker
Copy link
Collaborator

p-bakker commented May 25, 2021

Which pitfalls/challenges did you ran into?

What's your take on the approach I've taken?

@szegedi
Copy link
Contributor Author

szegedi commented May 25, 2021

Just a quick note to thank you @p-bakker for taking a stab at this. I myself have stopped working on this specifically because I didn't have more time to spend on it at the time, and what I thought will be a relatively small and contained fix turned out to cause a lot of breakage (as you yourself have noticed.) I'm willing to provide advice on this if you get stuck, but can't commit to seeing it through myself. I'll circle back to try to understand your changes sometime later.

@p-bakker
Copy link
Collaborator

p-bakker commented May 25, 2021

Tnx @szegedi I'll report back here if I get stuck or get somewhere.

Thought I was close earlier today, but the spec bit me :-) Anyway, I'll fiddle some more and if I get close, I'll create a draft PR

If I get things working in interpreted mode, I think I could use your help with the CodeGen part for the non-interpreted implementation.

@rbri
Copy link
Collaborator

rbri commented May 25, 2021

My plan was to try to enhance the PR from szegedi - but from the spec i saw that the this handling is something that has to be done in the function before the start. Followed this way and made some progress. But (again) this breaks some tests that are already passing. From my point of view these test are passing at the moment because we use this == null as detection if we are calling the ctor or new. So far i had no idea how to solve this.

And yes, i also found it really hard to match the interpreter code to the steps done by the optimizer.

@p-bakker
Copy link
Collaborator

p-bakker commented May 28, 2021

So, I've been playing around with 2 different approaches. Can share 2 different branches with implementation prototypes, but I want to ask everyone some questions first.

basic gist of how the this value is to be handled in strict mode (see The Strict model of JavaScript, bullit 11)

  • In strict mode, the this value is to default to undefined, instead of the top level scope.
  • If a this value is explicitly passed (via .call/.apply/.bind, the value can be anything, including primitives, which should NOT be boxed.
  • If a topLevel scope is explicitly passed to a strict function (in JavaScript through .apply/call/bind), this inside the strict function WILL be the provided topLevel scope

implementation challenges

With that high level overview in mind, below the basic challenges with implementing the this value part of strict mode

non-scriptable this values

Calls to functions are made through the Callable.call(Context cx, Scriptable scope, Scriptable thisObj, Object[] args) interface method.

As can be seen, the thisObj is typed as a Scriptable, meaning it cannot be a a primitive value

this value can be undefined (null in Java land)

The entire Rhino codebase (and potentially 3rd party code) just assumes the thisObj value passed into Callable.call is never null, as in all places there calls to it are made there's code to send in the topLevel scope if no other value has been provided

possible implementation routes

With the above mentioned implementation challenges I see two basis routes for implementing the this value part of strict mode

  1. Make the Callable interface support type Object for the thisObj
  2. Keep passing a Scriptable instance as thisObj into Callable.call, by internally wrapping primitives in a Scriptable and use Undefined.SCRIPTABLE_UNDEFINED for null/undefined values

In both cases, all Java code that does anything with the thisObj value needs to do more checking of the value it receives, because before it would always be a Scriptable instance and now, depending on the route taken, it could also be a primitive or null (route 1) or a 'WrapperScriptable' on which the code should not operate directly, but first get the actual value, which then can still be null or a primitive route 2)

While we can refactor all Rhino code and make sure everything works, there's 3rd party code to take into consideration. So either we make breaking changes and 3rd party code needs to adjust before they can upgrade or we try to somehow keep things compatible with 3rd party code (potentially with some flag to enable (some of) the new behavior).

IMHO, of the two implementation routes, the first is the cleaner approach, as it aligns more with the spec and isn't trying to jump through hoops. I've prototypes something that I think will be able to give us compatibility with 3rd party code: introduce a new EcmaCallable interface and make Callable extend this interface and redirect calls to the new interface method:

package org.mozilla.javascript;

public interface EcmaCallable {
    /**
     * Perform the call.
     *
     * @param cx the current Context for this thread
     * @param scope the scope to use to resolve properties.
     * @param thisObj the JavaScript <code>this</code> object
     * @param args the array of arguments
     * @return the result of the call
     */
    public Object call(Context cx, Scriptable scope, Object thisObj, Object[] args);
}

And

package org.mozilla.javascript;

import java.lang.reflect.Method;

/**
 * A version of EcmaCallable that implements the deprecated .call(...) variant taking a only a Scriptable as thisObj.
 * 
 * The difference between EcmaCallable and this interfact is that EcmaCallable supports the `this` handling defined in EcmaScript
 * where the `this` value can be anything, so not necicarrily a Scriptable instance
 */
public interface Callable extends EcmaCallable
{
    /**
     * Perform the call.
     * 
     * Default implemenation of the old callable interface, for 3rd party code that still calls Callable's that way
     * 
     * @deprecated Use {@link #call(Context cx, Scriptable scope, Object thisObj, Object[] args)} instead
     *
     * @param cx the current Context for this thread
     * @param scope the scope to use to resolve properties.
     * @param thisObj the JavaScript <code>this</code> object
     * @param args the array of arguments
     * 
     * @return the result of the call
     */
	@Deprecated
    default Object call(Context cx, Scriptable scope, Scriptable thisObj, Object[] args) {
		return call(cx, scope, (Object) thisObj, args);
	}
    
	//Object call(Context cx, Scriptable scope, Object thisObj, Object[] args);
	/**
	 * This version of call will be by functions by Rhino and be called by Rhino code that invokes functions
	 * 
	 * The default implementation is for 3rd party code that implemented the Callable interface before this version of the call method got implemented
	 */
    default Object call(Context cx, Scriptable scope, Object thisObj, Object[] args) {
    	// First prevent the unlikely event of oscillation between the two default methods in this interface
    	Method m = null;
    	
    	try {
    	   m = this.getClass().getMethod("call",  Context.class, Scriptable.class, Scriptable.class, Object[].class);
    	} catch (Exception e) {}
    			
    	if (m.isDefault()) throw new RuntimeException();

        // TODO exact logic here needs to be further thought out, but this is sort of the basic gist
        // CHECKME maybe also make what happens here depend on a global flag to opt in/out of this new behavior
    	if (thisObj == null) {
    		thisObj = cx.topCallScope;
    	} else if (!(thisObj instanceof Scriptable)) { // Should also check if NOT in strict mode, but cx.currentActivationCall(.isStrict) is null here...
        	Context.reportWarning("Coercing non-scriptable to Scriptable as ${} doesn't override call(Context cx, Scriptable scope, Object thisObj, Object[] args) method");
            thisObj = ScriptRuntime.toObject(cx, scope, thisObj);
        }

        return call(cx, scope, (Scriptable) thisObj, args);
    }
}

The question

So, the questions I have for all involved, assuming we do want to tackle the this value issue in strict mode:

  1. Which of the two options is preferred/looks more feasible (or does anyone have other idea's how to go about it?)
  2. Thoughts on whether to do this for the coming release or a (major?) release somewhere in the future
  3. Technical question: why does the Function interface seemingly redeclare the same interface method as the Callable interface?

One thing to note though: I've seen Lamba functions in the Java codebase that implement the Callable.call interface method. The above mentioned default method impl. of Callable.call conflicts with those, as Java doesn't support Lambda functions implementing a Interface method that is a default method. Changing them to implement the EcmaCallable.call solved the problem, but if 3rd party code does this....

@tuchida
Copy link
Contributor

tuchida commented May 28, 2021

I'm not sure if this has anything to do with this issue.

public static Callable getNameFunctionAndThis(String name, Context cx, Scriptable scope) {

getXXXFunctionAndThis is resolving this, but when the function to be called is BoundFunction or ArrowFunction, this is not necessary. In particular, ArrowFunction is currently not very efficient. Is there likely to be a way to improve performance?

@p-bakker
Copy link
Collaborator

p-bakker commented May 28, 2021

That is one of the places where the value for this is getting resolved to the topLevel scope (unconditionally).

The exact mechanism to resolve the this value and how to do it in the most efficient way (for example not at all for Array functions or functions running in strict mode) needs to be determined, but it might even be possible to resolve it only when it's actually requested: there's a good number of Callable.call implementations that don't do anything with the thisObj parameter value.

@p-bakker
Copy link
Collaborator

@gbrail @rbri @szegedi Any thoughts on #442 (comment)?

I'd like to proceed, but don't wanna go down a path that isn't likely to get accepted

@p-bakker p-bakker added the Strict Mode Issues related to non-compliance with the ES5 Strict Mode spec label Jun 29, 2021
@p-bakker p-bakker added the Ecma Incompatibility Issues about Rhino being incompatible with the EcmaScript spec label Aug 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
block-scope Ecma Incompatibility Issues about Rhino being incompatible with the EcmaScript spec Strict Mode Issues related to non-compliance with the ES5 Strict Mode spec
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants