Support precise type inference for element indexing #104

LPTK · 2022-04-21T10:15:52Z

The goal is to support precise type inference for the indexing operator a[i].

For instance, if a turns out to have type Array<int>, then a[0] should have type int | undefined; but if a[0] turns out to have a more precise type such as [int, bool, nat], then a[0] should have type int.

(Note: the ... | undefined return type for imprecise indexing type inference is in the process of being implemented in mahzoun99#1.)

The way to do this is as follows.

We should add some meta information in type variables, in addition to upper and lower bounds:

A list of "indexed in" info (A, R), which specifies what type the type variable is used to index into A and that the type resulting from this operation is R.
A list of "indexed by" info (T, R), which specifies what type the type variable is indexed into by some concrete T and that the type resulting from this operation is R.

For example, if a: ?A and i: ?I , then typing a[i] will result in creating a new type variable ?R and adding (?A, ?R) to the "indexed in" info of type variable ?I.

Then, the job of type inference is simply making sure that all ?I::(?A, ?R) and ?A::(?I, ?R) info remain consistent, by propagating concrete bounds that are added to the relevant type variables.

More specifically: when constraining a type variable with a new concrete lower bound B in the rec function of ConstraintSolver.scala, we should check:

its "indexed in" info. For each such (?A, ?R), we should add (B, ?R) to it, and while doing so we should make sure that this (B, ?R) is consistent with the existing lower bounds in ?A;
its "indexed by" info. For each such (T, ?R), we should make sure that B is consistent with it.

Example: if add a lower bound 1 | 2 to type variable I0 which has "indexed in" info of (?A0, ?R0), then I add (1 | 2, ?R0) as "indexed by" into in ?A0. Then, if [S, T, U] is a lower bound on?A0, I should look at what the index 1 | 2 yields from that tuple, in this case T and U, and add these as lower bounds in R0.

There is one subtlety to deal with, though: if the (?A, ?R) info we want to register into some ?I has a higher level than ?A, we need to extrude its components to the correct level, using the extrude method.

The text was updated successfully, but these errors were encountered:

LPTK · 2022-04-27T01:39:15Z

Actually a simpler approach may be possible, although not completely equivalent. But it would be a cleaner first step.

The idea is to just introduce a new form of function-like type, call it IndexType(idx: SimpleType, res: SimpleType), written [idx] ~> res. When typing a[i], just create a fresh ?R and constrain lhs <! [i_ty] ~> ?R. Then solve the constraints in the obvious way.

We get ambiguous constraints when intersected indexed types appear on the LHS, but these will not actually arise from type inference, given the scheme above.

Matheart · 2022-07-30T17:36:46Z

can i say for the indexed in case, one of i's upper bounds must be int, and string for the indexed by case?

Matheart · 2022-07-31T04:51:09Z

So the main algorithm is as follows:

We handle the array and record cases separately, where we should write two functions IndexedIn (for array) and IndexedBy (for record) inside the ConstraintSolver class in ConstraintSolver.scala.
These two functions should pass extra meta information to allow more accurate type inference. We should not change this information unless in the rec function where we need to use the info to add ty to TypeProvence. We should also pay attention to levels because incorrect levels may result in unexpected error.

So I only need to write code in the two new functions, also inside the rec function, is my understanding correct?

I still have one question, where should we call the two functions?

Matheart · 2022-08-01T08:15:43Z

For the other algorithm you mentioned, do I still need to write the two functions?

LPTK · 2022-08-01T09:28:15Z

can i say for the indexed in case, one of i's upper bounds must be int, and string for the indexed by case?

We handle the array and record cases separately, where we should write two functions IndexedIn (for array) and IndexedBy (for record)

No, that's not at all the semantics I proposed above. Please read again:

A list of "indexed in" info (A, R), which specifies what type the type variable is used to index into A and that the type resulting from this operation is R.

A list of "indexed by" info (T, R), which specifies what type the type variable is indexed into by some concrete T and that the type resulting from this operation is R.

Also for now let's only focus on arrays.

2. These two functions should pass extra meta information to allow more accurate type inference. We should not change this information unless in the rec function where we need to use the info to add ty to TypeProvence. We should also pay attention to levels because incorrect levels may result in unexpected error.

So I only need to write code in the two new functions, also inside the rec function, is my understanding correct?

It seems you have forgotten or you misremember the explanations I gave you in our last meeting. Did you take notes? I proposed op add the "indexed by" and "indexed in" lists as mutable var fields in TypeVariable, and treat them accordingly when constraining type variables (in the case when new upper/lower bounds are registered).

For the other algorithm you mentioned, do I still need to write the two functions?

Let's not consider that one for now.

Matheart · 2022-08-16T15:56:47Z

Things to be discussed in the meeting tmr (just want to drop it here so more convenient to take look)

The Remaining Cases:
(1). [1 + 1]: seems it goes into case (_, t: TypeVariable) but idk why implementation is missing at the end

(2). Defined things but not being assigned a value for string, float, etc... , I found (_, TypeRef(TypeName("string") can be only used to match this instead of literal string. To match literal string, I need to write (_, ClassTag(StrLit(_), _). So is it necessary for me to write both of them (which makes code look clumsy)

(3). For the foo case the type inference is still wrong: foo: Array[~undefined] -> nothing. I guess this might be related with my wrong implementation in the extrude and freshen functions. Also I have not done this part yet: if the (?A, ?R) info we want to register into some ?I has a higher level than ?A, we need to extrude its components to the correct level, using the extrude method.

(4). How to match this casedef haha2 : Array[int] haha2[0], and also for string, ... etc

The Report:
take a look at the current draft (finished the part before current problem and solution), and maybe take a look at the complete draft later this week

LPTK added the enhancement New feature or request label Apr 21, 2022

LPTK assigned Meowcolm024 Apr 21, 2022

LPTK assigned Matheart and unassigned Meowcolm024 Jul 19, 2022

LPTK added this to MLscript backlog Aug 31, 2022

LPTK moved this to 🆕 New in MLscript backlog Aug 31, 2022

CAG2Mark moved this from 🆕 New to 💔 Stalled in MLscript backlog Aug 1, 2024

CAG2Mark moved this from 💔 Stalled to 🆕 New in MLscript backlog Aug 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support precise type inference for element indexing #104

Support precise type inference for element indexing #104

LPTK commented Apr 21, 2022

LPTK commented Apr 27, 2022 •

edited

Loading

Matheart commented Jul 30, 2022

Matheart commented Jul 31, 2022 •

edited

Loading

Matheart commented Aug 1, 2022 •

edited

Loading

LPTK commented Aug 1, 2022

Matheart commented Aug 16, 2022

Support precise type inference for element indexing #104

Support precise type inference for element indexing #104

Comments

LPTK commented Apr 21, 2022

LPTK commented Apr 27, 2022 • edited Loading

Matheart commented Jul 30, 2022

Matheart commented Jul 31, 2022 • edited Loading

Matheart commented Aug 1, 2022 • edited Loading

LPTK commented Aug 1, 2022

Matheart commented Aug 16, 2022

LPTK commented Apr 27, 2022 •

edited

Loading

Matheart commented Jul 31, 2022 •

edited

Loading

Matheart commented Aug 1, 2022 •

edited

Loading