Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more invalid test cases for parsing entitly declaration #183

Merged
merged 19 commits into from
Jul 24, 2024

Conversation

Watson1978
Copy link
Contributor

This patch will add the test cases to verify that it raises an exception properly when parsing malformed entity declaration.

Comment on lines 163 to 164
class TestCharacterCode < self
def test_pubid_literal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class TestCharacterCode < self
def test_pubid_literal
class TestPublicIDLiteral < self
def test_invalid_pubid_char

Could you add PubidLiteral and PubidChar definitions as a comment to understand why this value is invalid?

class TestCharacterCode < self
def test_pubid_literal
exception = assert_raise(REXML::ParseException) do
REXML::Document.new('<!DOCTYPE root [<!ENTITY foo PUBLIC "あいう" "bar" > ]>')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use \U{...} instead of to keep this code ASCII only?

@@ -23,6 +23,158 @@ def parse(internal_subset)
end

public

class TestGarbagedValue < self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use syntax part name such as GeneralEntityDeclration instead of GarbgedValue for test case name? We want to organize tests based of syntax part not invalid type to avoid missing case.

Suggested change
class TestGarbagedValue < self
class TestGeneralEntityDeclration < self
# Cover all GEDecl cases in this instead of spreading multiple test cases
def test_invalid_entity_value
def test_invalid_external_id

Could you also add generic entity declaration definition (and related definitions in this test case) as a comment for easy to understand?

class TestGarbagedValue < self
def test_garbaged_after_entity_definition
exception = assert_raise(REXML::ParseException) do
REXML::Document.new('<!DOCTYPE root [<!ENTITY "foo" xxx > ]>')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use more easy to understand sample values instead of "foo" and xxx?
For example, "valid-name" and invalid-external-id.

@Watson1978
Copy link
Contributor Author

@kou Thank you for your review. I fixed the tests.

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DETAIL
end

class TestSystemExternalEntities < self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class TestSystemExternalEntities < self
class TestSystemLiteral < self

end

class TestSystemExternalEntities < self
def test_invalid_external_entity_before_system_literal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_invalid_external_entity_before_system_literal
def test_no_quote

class TestSystemExternalEntities < self
def test_invalid_external_entity_before_system_literal
exception = assert_raise(REXML::ParseException) do
REXML::Document.new('<!DOCTYPE root [<!ENTITY valid-name SYSTEM invalid-external-entity "valid-system-literal" > ]>')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "valid-system-literal" needed here?

DETAIL
end

def test_invalid_external_entity_after_system_literal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you split this to TestNotationDataDeclaration?

Suggested change
def test_invalid_external_entity_after_system_literal
def test_not_ndata


def test_invalid_external_entity_after_system_literal
exception = assert_raise(REXML::ParseException) do
REXML::Document.new('<!DOCTYPE root [<!ENTITY valid-name SYSTEM "valid-system-literal" invalid-external-entity > ]>')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not-ndata or something is better than invalid-external-entity because it should be NDATA.

end
end

class TestPublicExternalEntities < self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class TestPublicExternalEntities < self
class TestPublicLiteral < self

Line: 1
Position: 76
Last 80 unconsumed characters:
valid-name PUBLIC \"あ\" \"valid-system-literal\" > ]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use \uXXXX?

This patch will add the test cases to verify that it raises an exception properly when parsing malformed entity declaration.
Fix test class names
@Watson1978
Copy link
Contributor Author

I completely rewrote the code and added tests to cover the specification.

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add cases that has garbage after valid pattern?

Comment on lines 27 to 28
class TestGeneralEntityDeclaration < self
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-GEDecl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use a comment for class instead of a comment in class?

Suggested change
class TestGeneralEntityDeclaration < self
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-GEDecl
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-GEDecl
class TestGeneralEntityDeclaration < self

Comment on lines 31 to 33
class TestEntityValue < self
def test_no_quote
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EntityValue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class TestEntityValue < self
def test_no_quote
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EntityValue
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EntityValue
class TestEntityValue < self
def test_no_quote

def test_no_quote
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EntityValue
exception = assert_raise(REXML::ParseException) do
REXML::Document.new('<!DOCTYPE root [<!ENTITY valid-name invalid-entity-value > ]>')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove needless spaces?

Suggested change
REXML::Document.new('<!DOCTYPE root [<!ENTITY valid-name invalid-entity-value > ]>')
REXML::Document.new('<!DOCTYPE root [<!ENTITY valid-name invalid-entity-value>]>')

DETAIL
end

def test_invalid_entity_value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use why this is an invalid value for test name instead of test_invalid_entity_value like you did for test_no_quote?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the method name at 98e0d8f

DETAIL
end

class TestExternalId < self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class TestExternalId < self
class TestExternalID < self

end

class TestNDataDeclaration < self
def test_no_quote
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "no quote" a suitable reason?
I think that NDATA is only the valid keyword.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry. It is my mistake.

end
end

class TestNDataDeclaration < self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add a test for invalid Name in NDataDecl ::= S 'NDATA' S Name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added tests for Name at 9ecad7c


class TestParsedEntityDeclaration < self
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-PEDecl
class ParsedEntityDefinition < self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class ParsedEntityDefinition < self
class TestParsedEntityDefinition < self

Comment on lines 147 to 148
class TestEntityValue < self
class TestEntityValue < self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reduce a needless class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry. It is my mistake.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed at 77b23ae

def test_unnecessary_ndata_declaration
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-PEDef
exception = assert_raise(REXML::ParseException) do
REXML::Document.new('<!DOCTYPE root [<!ENTITY % valid-name "valid-entity-value" "NDATA" valid-ndata-value > ]>')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
REXML::Document.new('<!DOCTYPE root [<!ENTITY % valid-name "valid-entity-value" "NDATA" valid-ndata-value > ]>')
REXML::Document.new('<!DOCTYPE root [<!ENTITY % valid-name "valid-entity-value" NDATA valid-ndata-value > ]>')

BTW, should we use the ExternalID NDataDecl? part instead of the EntityValue part in EntityDef ::= EntityValue | ExternalID NDataDecl?)? If we want to use the EntityValue part, we should use "garbage" or something instead of ndata_declaration because all contents (including NDataDecl) are invalid after the EntityValue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sence.
We should make sure to check that any unnecessary NDataDecl after EntityValue will raise an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test at 4782084

Copy link
Contributor

@otegami otegami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding test cases which follows XML's entity declarations.


Should we test PEReference or Reference which are included in EntityValue although those test cases might be out of scope ?
And also, how about adding the following test cases?

  • Using single quotes
  • Using mixed quotes (' dummy " or " dummy ')
  • Removing necessary spaces

DETAIL
end

def test_unnecessary_ndata_declaration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this to TestEntityDefinition and use test_entity_value_and_notation_data_declaration?

This test case is for EntityValue not EntityDef.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed at c211ab5

@Watson1978
Copy link
Contributor Author

* Using single quotes

I want to add invalid cases in this PR. I think this case is valid test.
So, I didn't add this case.

* Using mixed quotes (`' dummy "` or `" dummy '`)

* Removing necessary spaces

I added at

Thanks

Copy link
Contributor

@otegami otegami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing it. Although I made a few comments, it looks good overall !

test/parse/test_entity_declaration.rb Outdated Show resolved Hide resolved
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-ExternalID
class TestExternalID < self
# https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-SystemLiteral
class TestSystemLiteral < self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding the cases that there is no SystemLiteral ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I added the test cases at 8b42c31

Watson1978 and others added 2 commits July 23, 2024 18:43
Co-authored-by: takuya kodama <a.s.takuya1026@gmail.com>
Copy link
Contributor

@otegami otegami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks nice to me, though I left a small comment 👍🏾

test/parse/test_entity_declaration.rb Outdated Show resolved Hide resolved

def test_mixed_quote
exception = assert_raise(REXML::ParseException) do
REXML::Document.new('<!DOCTYPE root [<!ENTITY valid-name "invalid-entity-value\'>]>')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use " for all string literals to avoid the same problem as #166 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed at 556487b

Line: 1
Position: 61
Last 80 unconsumed characters:
valid-name \"invalid-entity-value'>]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need \ before " in here document?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The escapes are unnecessary.
I removed them at 4bada10

Watson1978 and others added 2 commits July 24, 2024 09:39
Co-authored-by: takuya kodama <a.s.takuya1026@gmail.com>
Copy link
Contributor

@otegami otegami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍🏾

@kou kou merged commit 086287c into ruby:master Jul 24, 2024
61 checks passed
@kou
Copy link
Member

kou commented Jul 24, 2024

Thanks.

@Watson1978 Watson1978 deleted the add-tests branch July 24, 2024 08:01
@sebm123
Copy link

sebm123 commented Jul 30, 2024

This patch will add the test cases to verify that it raises an exception properly when parsing malformed entity declaration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants