Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TPropertyValue::ensureArray needs better "(...)" handling without eval #852

Open
belisoful opened this issue Dec 26, 2022 · 5 comments
Open

Comments

@belisoful
Copy link
Member

When TPropertyValue was written, it was PHP4/5. Things have changed. Standard PHP arrays are not "array()" but "[]".

So I propose that TPropertyValue::ensureArray be tweaked for the [newer] versions (7 & 8) of PHP format of array "[...]"

	public static function ensureArray($value): array
	{
		if (is_string($value)) {
			$value = trim($value);
			$len = strlen($value);
			if ($len >= 2 && $value[0] == '(' && $value[$len - 1] == ')') {
				return eval('return array' . $value . ';');
			} if ($len >= 2 && $value[0] == '[' && $value[$len - 1] == ']') {
				return eval('return ' . $value . ';');
			} else {
				return $len > 0 ? [$value] : [];
			}
		} else {
			return (array) $value;
		}
	}

Also, because Arrays are created by eval(), Using this function to create an array from user input could be a security risk. While that is not a typical use case, it is possible.

To reduce the security footprint of this function, I propose checking the input string for "Prado::" and not performing the eval if it's there. This is a data scrubbing function, not an "execute system routines" function.

An array string of "(Prado::getApplication()->getModule-('aModule')->flushDatabase())" might do something bad, so limiting the array eval to not access "Prado::" may be wise. For security purposes, I propose the following:

	public static function ensureArray($value): array
	{
		if (is_string($value)) {
			$value = trim($value);
			$len = strlen($value);
			if ($len >= 2 && $value[0] == '(' && $value[$len - 1] == ')' && stripos($value, "Prado::") === false) {
				return eval('return array' . $value . ';');
			} if ($len >= 2 && $value[0] == '[' && $value[$len - 1] == ']' && stripos($value, "Prado::") === false) {
				return eval('return ' . $value . ';');
			} else {
				return $len > 0 ? [$value] : [];
			}
		} else {
			return (array) $value;
		}
	}

Unit tests are needed, phpdoc would need updated as well. TPropertyValue has no unit tests. I have them written. But not for this, not yet.

@ctrlaltca
Copy link
Member

TPropertyValue::ensureArray() was originally meant to parse simple lists of items in templates.
Using it to parse data received from users is a guaranteed remote code execution.
Any kind of workaround applied to eval() can be easily defeated by obscuring the cose, eg. if $value contains "new object()", "unserialize(whatever)", etc...
Replacing eval() with something else is really a nice improvement. The first alternative that i can think of is str_getcsv().
It won't handle complicate cases (eg. nested arrays or keyed arrays), but i guess we don't want to handle those anyway.
Right now the function is used only in a few places inside Prado, but unfortunately some of them seems to possibly need complicate cases, eg. TMemcache::setOptions().
I'd drop happily support for these cases if needed.

@belisoful
Copy link
Member Author

We do need some kind of nested array processing. "[[1, 2, 1], [2, 4, 2], [1, 2, 1]]" because I have an image convolution filter for a new publishing image processor behavior. The new TAsset class (and publishing) is bonkers awesome. I'm doing the test cases for it now. The new File publishing system is going to be a new key core feature. We can add our own watermarks and copyright to images on publish. ensureArray is used by the image convolution filter. That's why this method is getting some attention and love.

(Also, I'm working on file meta data... eg. scrubbing image metadata, adding our own meta data (like copyright and updated publish times!), and selecting which images to manipulate on publish based on meta data.)

I agree that ensureArray for user input is an edge case. I don't like the eval. Yes, exactly, there are 1001 ways to compromise ensureArray.

Also, we have to put quotes in the array to get it to work properly with strings. eg "('item1', 'item2', 'item3')" as opposed to "(item1, item2, item3, my "text" 'in' a long string, 'my "text",' in a string', "my 'text',"in" a string", (1, 2 , 3))".

My thought is that originally, when PRADO and TPropertyValue was created, PHP allowed constants as strings and so it was "OK" to do "(item1, item2, item3)". This, of course, is an error today.

@belisoful
Copy link
Member Author

belisoful commented Jan 1, 2023

I am having success with this Regular Expression in validating an array:

\((?P<inner>(?P<comma>(?P>rv),)*(?P<rv>\s*(?P<value>(?P<recur>(?R))|(?P<const>"(?:[^"]|(?<=\\)")*?"|'(?:[^']|(?<=\\)')*?'|[^'"\(\)\s][^\(\),]*?))?\s*))\)

php:
$regex = '/\((?P<inner>(?P<comma>(?P>rv),)*(?P<rv>\s*(?P<value>(?P<recur>(?R))|(?P<const>"(?:[^"]|(?<=\\\\)")*?"|\'(?:[^\']|(?<=\\\\)\')*?\'|[^\'"\(\)\s][^\(\),]*?))?\s*))\)/m';

it properly recurses the bracket, eg. '((((),()),()),())'. It handles numbers and strings (without quotes then the commas become terminators), strings with single or double quotes.

String that start with ' or " must terminate reciprocally. But strings with commas as terminators (aka, do not start with ' or ") may have the ' or " as the second or beyond characters without having to be balanced (in any way).

strings with commas as terminators cannot use the ( or ) characters for being the array syntax. To have a ( or ) in a string, single or double quotes around the string must be used. Without this, infinite loops happen.

"()", "(2)", "( 2 )", "(2, 3)", "(abc, xyz)", "(abc, 'efg', "xyz")"
('abc,"'()', "xyz,'"()")

errors:
(11, string with()out quotes, xyz)
('abc' 'xyz')

parsing and validating is the next step, recursive parsing makes sense. "true" to true, integers to int, numbers to float. in typing comma separated strings.

will this work? https://regex101.com/r/wJY9Wn/1

@belisoful
Copy link
Member Author

I'd like to add that 0x00ada123 should also be supported by the array processor. One of the image filters is an array of pixel colors that would be expressed as "(0x00FFFFFF, 0x7F000000, 0x3f808080)"

@belisoful belisoful changed the title TPropertyValue::ensureArray needs to handle "[...]" arrays besides "(...)" arrays TPropertyValue::ensureArray needs better "(...)" handling without eval Jan 23, 2023
@belisoful
Copy link
Member Author

The convolution filter has an array within array structure, so eg "((1, 2 , 1), (2, 4, 2), (1, 2, 1))". Which is not supported currently. it must be this to currently work: "([1,2,1], [2,4,2], [1,2,1])"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants