Diffstat (limited to 'impl/antlr/libantlr3c-3.4/ChangeLog')
-rw-r--r-- | impl/antlr/libantlr3c-3.4/ChangeLog | 550 |
1 files changed, 550 insertions, 0 deletions
diff --git a/impl/antlr/libantlr3c-3.4/ChangeLog b/impl/antlr/libantlr3c-3.4/ChangeLog
new file mode 100644
index 0000000..c5540d9
--- /dev/null
+++ b/impl/antlr/libantlr3c-3.4/ChangeLog
@@ -0,0 +1,550 @@
+The following changes (change numbers refer to perforce) were
+made from version 3.1.1 to 3.1.2
+
+Runtime
+-------
+
+Change 5641 on 2009/02/20 by jimi@jimi.jimi.antlr3
+
+    Release version 3.1.2 of the ANTLR C runtime.
+
+    Updated documents and release notes will have to follow later.
+
+Change 5639 on 2009/02/20 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-356
+
+    Ensure that code generation for C++ does not require casts
+
+Change 5577 on 2009/02/12 by jimi@jimi.jimi.antlr3
+
+    C Runtime - Bug fixes.
+
+    o Having moved to use an extract directly from a vector for returning
+      tokens, it exposed a
+      bug whereby the EOF boundary calculation in tokLT was incorrectly
+      checking > rather than >=.
+    o Changing to API initialization of tokens rather than memcmp()
+      incorrectly forgot to set the input stream pointer for the
+      manufactured tokens in the token factory;
+    o Rewrite streams for rewriting tree parsers did not check whether the
+      rewrite stream was ever assigned before trying to free it, it is now
+      in line with the ordinary parser code.
+
+Change 5576 on 2009/02/11 by jimi@jimi.jimi.antlr3
+
+    C Runtime: Ensure that when we manufacture a new token for a missing
+    token, that the user supplied custom information (if any) is copied
+    from the current token.
+
+Change 5575 on 2009/02/08 by jimi@jimi.jimi.antlr3
+
+    C Runtime - Vastly improve the reuse of allocated memory for nodes in
+    tree rewriting.
+
+    A problem for all targets at the moment is that the rewrite logic
+    generated by ANTLR makes no attempt
+    to reuse any resources, it merely guarantees that the tree shape at the
+    end is correct. To some extent this is mitigated by the garbage
+    collection systems of Java and .Net, even though it is still an overhead to
+    keep creating so many nodes.
+
+    This change implements the first of two C runtime changes that make
+    best efforts to track when a node has become orphaned and will never
+    be reused, based on inherent knowledge of the rewrite logic (which in
+    the long term is not a great solution).
+
+    Much of the rewrite logic consists of creating a nilNode into which
+    child nodes are appended. At: rulePost processing time; when a rewrite
+    stream is closed; and when becomeRoot is called, there are many situations
+    where the root of the tree that will be manipulated, or is finished with
+    (in the case of rewrite streams), is a nilNode that was just a temporary
+    creation for the sake of the rewrite itself.
+
+    In these cases we can see that the nilNode would just be left to rot in
+    the node factory that tracks all the tree nodes.
+    Rather than leave these in the factory to rot, we now keep a reuse
+    stack and always reuse any node on this
+    stack before claiming a new node from the factory pool.
+
+    This single change alone reduces memory usage in the test case (20,604
+    line C program and a GNU C parser)
+    from nearly a GB, to 276MB. This is still way more memory than we
+    should need to do this operation, even on such a large input file,
+    but the reduction results in a huge performance increase and greatly
+    reduced system time spent on allocations.
+
+    After this optimization, comparison with gcc yields:
+
+    time gcc -S a.c
+    a.c:1026: warning: conflicting types for built-in function ‘vsprintf’
+    a.c:1030: warning: conflicting types for built-in function ‘vsnprintf’
+    a.c:1041: warning: conflicting types for built-in function ‘vsscanf’
+    0.21user 0.01system 0:00.22elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
+    0inputs+240outputs (0major+8345minor)pagefaults 0swaps
+
+    and
+
+    time ./jimi
+    Reading a.c
+    0.28user 0.11system 0:00.39elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
+    0inputs+0outputs (0major+66609minor)pagefaults 0swaps
+
+    And we can now infer that the only major difference is
+    now the huge disparity in memory allocations. A
+    future optimization of vector pooling, to separate node reuse from vector
+    reuse, currently looks promising for further reuse of memory.
+
+    Finally, a static analysis of the rewrite code, plus a realtime analysis
+    of the heap at runtime, may well give us a reasonable memory usage
+    pattern. In reality though, it is the generated rewrite logic
+    that must become optimal at not continuously rewriting things that it
+    need not, as it ascends the rule chain.
+
+Change 5563 on 2009/01/28 by jimi@jimi.jimi.antlr3
+
+    Allow rewrite streams to use the base adaptor's vector factory and not
+    try to malloc new vectors themselves.
+
+Change 5562 on 2009/01/28 by jimi@jimi.jimi.antlr3
+
+    Don't use CALLOC to allocate tree pools, use malloc as there is no need
+    for calloc.
+
+Change 5561 on 2009/01/28 by jimi@jimi.jimi.antlr3
+
+    Prevent warnings about retval.stop not being initialized when a rule
+    returns early because it is in backtracking mode
+
+Change 5558 on 2009/01/28 by jimi@jimi.jimi.antlr3
+
+    Lots of optimizations (though the next one to be checked in is the huge
+    win) for AST building and vector factories.
+
+    A large part of tree rewriting was the creation of vectors to hold AST
+    nodes.
+    Although I had created a vector factory, for some reason I never got
+    around to creating a proper one, that pre-allocated the vectors in chunks and
+    so on. I guess I just forgot to. Hence a big win here is prevention of calling
+    malloc lots and lots of times to create vectors.
+
+    A second improvement was to change the vector definition such that it
+    holds a certain number of elements within the vector structure itself, rather
+    than malloc and freeing these. Currently this is set to 8, but may increase.
+    For AST construction, this is generally a big win because AST nodes don't often
+    have many individual children unless there has not been any shaping going on in
+    the parser. But if you are not shaping, then you don't really need a tree.
+
+    Other performance improvements here include not calling functions
+    indirectly within token stream and common token stream. Hence tokens are
+    claimed directly from the vectors. Users can override these functions of course
+    and all this means is that if you override tokenstreams then you pretty much
+    have to provide all the methods, but then I think you would have to anyway (and
+    I don't know of anyone that has wanted to do this as you can carry your own
+    structure around with the tokens anyway and that is much easier).
+
+Change 5555 on 2009/01/26 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-288
+    Correct the interpretation of the skip token such that channel, start
+    index, char pos in line, start line and text are correctly reset to the start of
+    the new token when the one that we just traversed was marked as being skipped.
+
+    This correctly excludes the text that was matched as part of the
+    SKIP()ed token from the next token in the token stream and so has the side
+    effect that asking for $text of a rule no longer includes the text that should
+    be skipped, but DOES include the text of tokens that were merely placed off the
+    default channel.
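The inline-element vector layout described under Change 5558 above can be sketched roughly as follows. This is only an illustration of the idea, not the runtime's actual vector definition: `small_vector`, `INLINE_CAPACITY`, and every function name here are hypothetical, and the growth policy (doubling) is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_CAPACITY 8   /* mirrors the "currently 8" figure in the entry */

/* Hypothetical vector type for illustration only. */
typedef struct small_vector {
    void  **elements;                       /* inline_slots or a heap block */
    size_t  count;
    size_t  capacity;
    void   *inline_slots[INLINE_CAPACITY];  /* storage inside the struct itself */
} small_vector;

void small_vector_init(small_vector *v)
{
    assert(v != NULL);
    v->elements = v->inline_slots;
    v->count    = 0;
    v->capacity = INLINE_CAPACITY;
}

int small_vector_uses_heap(const small_vector *v)
{
    return v->elements != v->inline_slots;
}

/* Returns 1 on success, 0 if memory could not be obtained. */
int small_vector_add(small_vector *v, void *item)
{
    if (v->count == v->capacity) {
        size_t  new_capacity = v->capacity * 2;
        void  **grown;

        if (small_vector_uses_heap(v)) {
            grown = realloc(v->elements, new_capacity * sizeof *grown);
            if (grown == NULL) return 0;
        } else {
            /* First spill: copy the inline elements out to the heap. */
            grown = malloc(new_capacity * sizeof *grown);
            if (grown == NULL) return 0;
            memcpy(grown, v->inline_slots, v->count * sizeof *grown);
        }
        v->elements = grown;
        v->capacity = new_capacity;
    }
    v->elements[v->count++] = item;
    return 1;
}

void small_vector_free(small_vector *v)
{
    if (small_vector_uses_heap(v)) free(v->elements);
    small_vector_init(v);
}
```

A node with eight or fewer children never touches malloc in this sketch; only the ninth `small_vector_add` call moves the elements to the heap. Because `elements` may point into the struct itself, such a vector must not be copied by plain assignment.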
+
+Change 5551 on 2009/01/25 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-287
+    Most of the source files did not include the BSD license. This might
+    not be that big a deal given that I don't care what people do with it
+    other than take my name off it, but having the license reproduced
+    everywhere
+    at least makes things perfectly clear. Hence this mass change of
+    sources and templates
+    to include the license.
+
+Change 5550 on 2009/01/25 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-365
+    Ensure that as soon as we know about an input stream on the lexer that
+    we borrow its string factory and use it in our EOF token in case
+    anyone tries to make it a string, such as in error messages for
+    instance.
+
+Change 5548 on 2009/01/25 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-363
+    At some point the Java runtime default changed from discarding off-channel
+    tokens to preserving them. The fix is to make the C runtime also
+    default to preserving off-channel tokens.
+
+Change 5544 on 2009/01/24 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-360
+    Ensure that the fillBuffer function does not call any methods
+    that require the cached buffer size to be recorded before we
+    have actually recorded it.
+
+Change 5543 on 2009/01/24 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-362
+    Some users have started using string factories themselves and
+    exposed a flaw in the destroy method, that is intended to remove
+    a string that was created by the factory and is no longer needed.
+    The string was correctly removed from the vector that tracks them
+    but after the first one, all the remaining strings are then numbered
+    incorrectly. Hence the destroy method has been recoded to reindex
+    the strings in the factory after one is removed and everything is once
+    more hunky dory.
+    User suggested fix rejected.
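The reindexing fix described under Change 5543 above might look something like the following. It is a sketch under stated assumptions, not the runtime's string factory: `string_factory`, `tracked_string`, the fixed-size tracking table, and both function names are hypothetical stand-ins chosen to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_TRACKED 16   /* arbitrary limit for the sketch */

/* Hypothetical string record: each string remembers its slot in the factory. */
typedef struct tracked_string {
    size_t index;
    char  *chars;
} tracked_string;

typedef struct string_factory {
    tracked_string *strings[MAX_TRACKED];
    size_t          count;
} string_factory;

tracked_string *factory_new_string(string_factory *f)
{
    tracked_string *s;

    if (f->count == MAX_TRACKED) return NULL;
    s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->chars = NULL;
    s->index = f->count;
    f->strings[f->count++] = s;
    return s;
}

void factory_destroy_string(string_factory *f, tracked_string *s)
{
    size_t i;

    assert(s->index < f->count && f->strings[s->index] == s);

    /* Close the gap and renumber every string that moved down a slot.
     * Skipping the renumbering is the kind of flaw the entry describes:
     * later strings would carry stale index values. */
    for (i = s->index; i + 1 < f->count; i++) {
        f->strings[i] = f->strings[i + 1];
        f->strings[i]->index = i;
    }
    f->count--;

    free(s->chars);
    free(s);
}
```

After destroying the first of three strings, the remaining two report indexes 0 and 1, so a later destroy call on either of them finds the right slot.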
+
+Change 5542 on 2009/01/24 by jimi@jimi.jimi.antlr3
+
+    Fixed ANTLR-366
+    The recognizer state now ensures that all fields are set to NULL upon
+    creation
+    and the reset does not overwrite the tokenname array
+
+Change 5527 on 2009/01/15 by jimi@jimi.jimi.antlr3
+
+    Add the C runtime for 3.1.2 beta2 to perforce
+
+Change 5526 on 2009/01/15 by jimi@jimi.jimivista.antlr3
+
+    Correctly define the MEMMOVE macro which was inadvertently left to be
+    memcpy.
+
+Change 5503 on 2008/12/12 by jimi@jimi.jimi.antlr3
+
+    Change C runtime release number to 3.1.2 beta
+
+Change 5473 on 2008/12/01 by jimi@jimi.jimivista.antlr3
+
+    Fixed: ANTLR-350 - C runtime use of memcpy
+    Prior change to use memcpy instead of memmove in all cases missed the
+    fact that the string factory can be in a situation where overlaps occur. We now
+    have ANTLR3_MEMCPY and ANTLR3_MEMMOVE and use the two appropriately.
+
+Change 5471 on 2008/12/01 by jimi@jimi.jimivista.antlr3
+
+    Fixed ANTLR-361
+    - Ensure that ANTLR3_BOOLEAN is typedef'ed correctly when building for
+      MingW
+
+Templates
+---------
+
+Change 5637 on 2009/02/20 by jimi@jimi.jimi.antlr3
+
+    C runtime - make sure that ADAPTOR results are cast to the tree type on
+    a rewrite
+
+Change 5620 on 2009/02/18 by jimi@jimi.jimi.antlr3
+
+    Rename/Move:
+    From: //depot/code/antlr/main/src/org/antlr/codegen/templates/...
+    To: //depot/code/antlr/main/src/main/resources/org/antlr/codegen/templates/...
+
+    Relocate the code generating templates to exist in the directory set
+    that maven expects.
+
+    When checking in your templates, you may find it easiest to make a copy
+    of what you have, revert the change in perforce, then just check out the
+    template in the new location, and copy the changes back over. Nobody has more
+    than two files open at the moment.
+
+Change 5578 on 2009/02/12 by jimi@jimi.jimi.antlr3
+
+    Correct the string template escape sequences for generating scope
+    code in the C templates.
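The memcpy/memmove distinction behind Changes 5473 and 5526 above comes down to whether source and destination can overlap. A minimal sketch using only the standard C library; `insert_prefix` is a hypothetical helper, not a function from the runtime, and it assumes `prefix` does not point into `buf`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: inserts prefix at the front of the NUL-terminated
 * text held in buf, in place. buf_size is the total size of buf. */
void insert_prefix(char *buf, size_t buf_size, const char *prefix)
{
    size_t text_len   = strlen(buf);
    size_t prefix_len = strlen(prefix);

    assert(text_len + prefix_len + 1 <= buf_size);

    /* The old text slides right inside the same buffer, so source and
     * destination overlap: memcpy would be undefined here, memmove is not. */
    memmove(buf + prefix_len, buf, text_len + 1);

    /* prefix lives in separate storage (an assumption of this sketch),
     * so the cheaper memcpy is fine for this step. */
    memcpy(buf, prefix, prefix_len);
}
```

Called on a 16-byte buffer holding "stream" with the prefix "token", the buffer ends up holding "tokenstream".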
+
+Change 5577 on 2009/02/12 by jimi@jimi.jimi.antlr3
+
+    C Runtime - Bug fixes.
+
+    o Having moved to use an extract directly from a vector for returning
+      tokens, it exposed a
+      bug whereby the EOF boundary calculation in tokLT was incorrectly
+      checking > rather than
+      >=.
+    o Changing to API initialization of tokens rather than memcmp()
+      incorrectly forgot to
+      set the input stream pointer for the manufactured tokens in the
+      token factory;
+    o Rewrite streams for rewriting tree parsers did not check whether the
+      rewrite stream
+      was ever assigned before trying to free it, it is now in line with
+      the ordinary parser code.
+
+Change 5567 on 2009/01/29 by jimi@jimi.jimi.antlr3
+
+    C Runtime - Further Optimizations
+
+    Within grammars that used scopes and were intended to parse large
+    inputs with many rule nests,
+    the creation and deletion of the scopes themselves became significant.
+    Careful analysis shows that
+    for most grammars, while a parse could create and delete 20,000 scopes,
+    the maximum depth of
+    any scope was only 8.
+
+    This change therefore changes the scope implementation so that it does
+    not free scope memory when
+    it is popped but just tracks it in a C runtime stack, eventually
+    freeing it when the stack is freed. This change
+    caused the allocation of only 12 scope structures instead of 20,000 for
+    the extreme example case.
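The pop-without-free scheme just described can be sketched as a stack that parks popped scopes for the next push. This is a hedged illustration only: `scope`, `scope_stack`, and all function names are hypothetical, the initial capacity of 8 is arbitrary, and the generated code's real scope handling may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical scope payload. */
typedef struct scope {
    int   value;
    void *user_data;
} scope;

typedef struct scope_stack {
    scope **slots;      /* every scope ever allocated, live or parked */
    size_t  allocated;  /* how many slots hold a scope */
    size_t  capacity;   /* size of the slots array */
    size_t  depth;      /* how many scopes are currently pushed */
} scope_stack;

void scope_stack_init(scope_stack *s)
{
    s->slots     = NULL;
    s->allocated = 0;
    s->capacity  = 0;
    s->depth     = 0;
}

/* Returns a scope whose contents are NOT cleared: it comes from malloc or
 * from an earlier use, so the caller must initialize every field. */
scope *scope_push(scope_stack *s)
{
    if (s->depth == s->allocated) {
        if (s->allocated == s->capacity) {
            size_t  new_capacity = s->capacity ? s->capacity * 2 : 8;
            scope **grown = realloc(s->slots, new_capacity * sizeof *grown);
            if (grown == NULL) return NULL;
            s->slots    = grown;
            s->capacity = new_capacity;
        }
        s->slots[s->allocated] = malloc(sizeof(scope));
        if (s->slots[s->allocated] == NULL) return NULL;
        s->allocated++;
    }
    return s->slots[s->depth++];
}

void scope_pop(scope_stack *s)
{
    assert(s->depth > 0);
    s->depth--;          /* memory stays parked for the next push */
}

void scope_stack_free(scope_stack *s)
{
    size_t i;
    for (i = 0; i < s->allocated; i++) free(s->slots[i]);
    free(s->slots);
    scope_stack_init(s);
}
```

With this layout the number of allocations tracks the maximum nesting depth rather than the number of pushes: a thousand push/pop pairs at depth two allocate two scopes in total.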
+
+    This change means that scope users must be careful (as ever in C) to
+    initialize their scope elements
+    correctly as:
+
+    1) If not you may inherit values from a prior use of the scope
+       structure;
+    2) Scope structures are now allocated with malloc and not calloc;
+
+    Also, when using a custom free function to clean a scope when it is
+    popped, it is probably a good idea
+    to set any free'd pointers to NULL (this is generally good C programming
+    practice in any case)
+
+Change 5566 on 2009/01/29 by jimi@jimi.jimi.antlr3
+
+    Remove redundant BACKTRACK checking so that MSVC9 does not get confused
+    about possibly uninitialized variables
+
+Change 5565 on 2009/01/28 by jimi@jimi.jimi.antlr3
+
+    Use malloc rather than calloc to allocate memory for new scopes. Note
+    that this means users will have to be careful to initialize any values in their
+    scopes that they expect to be 0 or NULL and I must document this.
+
+Change 5564 on 2009/01/28 by jimi@jimi.jimi.antlr3
+
+    Use malloc rather than calloc for copying list label tokens for
+    rewrites.
+
+Change 5561 on 2009/01/28 by jimi@jimi.jimi.antlr3
+
+    Prevent warnings about retval.stop not being initialized when a rule
+    returns early because it is in backtracking mode
+
+Change 5560 on 2009/01/28 by jimi@jimi.jimi.antlr3
+
+    Add a NULL check before freeing rewrite streams used in AST rewrites
+    rather than auto-rewrites.
+
+    While the NULL check is redundant as the free cannot be called unless
+    it is assigned, Visual Studio C 2008
+    gets it wrong and thinks that there is a path that can arrive at the
+    free without it being assigned and that is too annoying to ignore.
+
+Change 5559 on 2009/01/28 by jimi@jimi.jimi.antlr3
+
+    C target Tree rewrite optimization
+
+    There is only one optimization in this change, but it is a huge one.
+
+    The code generation templates were set up so that at the start of a rule,
+    any rewrite streams mentioned in the rule were pre-created.
+    However, this
+    is a massive overhead for rules where only one or two of the streams are
+    actually used, as we create them then free them without ever using them.
+    This was copied from the Java templates basically.
+    This caused literally millions of extra calls and vector allocations
+    in the case of the GNU C parser given to me for testing with a 20,000
+    line program.
+
+    After this change, the following comparison is available against the gcc
+    compiler:
+
+    Before (different machines here so use the relative difference for
+    comparison):
+
+    gcc:
+
+    real 0m0.425s
+    user 0m0.384s
+    sys 0m0.036s
+
+    ANTLR C:
+    real 0m1.958s
+    user 0m1.284s
+    sys 0m0.656s
+
+    After the previous optimizations for vector pooling via a factory,
+    plus this huge win in removing redundant code, we have the following
+    (different machine to the one above):
+
+    gcc:
+    0.21user 0.01system 0:00.23elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
+    0inputs+328outputs (0major+9922minor)pagefaults 0swaps
+
+    ANTLR C:
+
+    0.37user 0.26system 0:00.64elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
+    0inputs+0outputs (0major+130944minor)pagefaults 0swaps
+
+    The extra system time comes from the fact that although the tree
+    rewriting is now optimal in terms of not allocating things it does
+    not need, there is still a lot more overhead in a parser that is generated
+    for generic use, including much more use of structures for tokens and extra
+    copying and so on. I will
+    continue to work on improving things where I can, but the next big
+    improvement will come from Ter's optimization of the actual code structures we
+    generate including not doing things with rewrite streams that we do not need to
+    do at all.
+
+    The second machine I used is about twice as fast CPU wise as the system
+    that was used originally by the user that asked about this performance.
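The optimization in Change 5559 above amounts to creating each rewrite stream on first use rather than at rule entry. A rough sketch of that idea follows; `rewrite_stream`, `streams_created`, and the function names are hypothetical and much simpler than what the C templates actually generate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a rewrite stream. */
typedef struct rewrite_stream {
    int element_count;
} rewrite_stream;

int streams_created = 0;   /* instrumentation for this sketch only */

rewrite_stream *stream_new(void)
{
    rewrite_stream *s = malloc(sizeof *s);
    if (s != NULL) {
        s->element_count = 0;
        streams_created++;
    }
    return s;
}

/* Adds an element, creating the stream only when it is first needed.
 * Returns 1 on success, 0 if the stream could not be allocated. */
int stream_add(rewrite_stream **slot)
{
    assert(slot != NULL);
    if (*slot == NULL) {
        *slot = stream_new();
        if (*slot == NULL) return 0;
    }
    (*slot)->element_count++;
    return 1;
}

/* Streams that were never used are still NULL, so check before freeing. */
void stream_free(rewrite_stream **slot)
{
    if (*slot != NULL) {
        free(*slot);
        *slot = NULL;
    }
}
```

A rule that declares several stream slots but only feeds one of them allocates exactly one stream here; the untouched slots stay NULL from rule entry to rule exit and cost nothing.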
+
+Change 5558 on 2009/01/28 by jimi@jimi.jimi.antlr3
+
+    Lots of optimizations (though the next one to be checked in is the huge
+    win) for AST building and vector factories.
+
+    A large part of tree rewriting was the creation of vectors to hold AST
+    nodes. Although I had created a vector factory, for some reason I never got
+    around to creating a proper one, that pre-allocated the vectors in chunks and
+    so on. I guess I just forgot to. Hence a big win here is prevention of calling
+    malloc lots and lots of times to create vectors.
+
+    A second improvement was to change the vector definition such that it
+    holds a certain number of elements within the vector structure itself, rather
+    than malloc and freeing these. Currently this is set to 8, but may increase.
+    For AST construction, this is generally a big win because AST nodes don't often
+    have many individual children unless there has not been any shaping going on in
+    the parser. But if you are not shaping, then you don't really need a tree.
+
+    Other performance improvements here include not calling functions
+    indirectly within token stream and common token stream. Hence tokens are
+    claimed directly from the vectors. Users can override these functions of course
+    and all this means is that if you override tokenstreams then you pretty much
+    have to provide all the methods, but then I think you would have to anyway (and
+    I don't know of anyone that has wanted to do this as you can carry your own
+    structure around with the tokens anyway and that is much easier).
+
+Change 5554 on 2009/01/26 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-379
+    For some reason in the past, the ruleMemozation() template had required
+    that the name parameter be set to the rule name. This does not seem to be a
+    requirement any more. The name=xxx override when invoking the template was
+    causing all the scope names derived when cleaning up in memoization to be
+    called after the rule name, which was not correct.
+    However, this only affected
+    the output when in output=AST mode.
+
+    This template invocation is now corrected.
+
+Change 5553 on 2009/01/26 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-330
+    Managed to get the one rule that could not see the ASTLabelType to call
+    back in to the super template C.stg and ask it to construct the name. I am not
+    100% sure that this fixes all cases, but I cannot find any that fail. Please
+    let me know if you find any examples of being unable to default the
+    ASTLabelType option in the C target.
+
+Change 5552 on 2009/01/25 by jimi@jimi.jimi.antlr3
+
+    Progress: ANTLR-327
+    Fix debug code generation templates when output=AST such that code
+    can at least be generated and I can debug the output code correctly.
+    Note that this checkin does not implement the debugging requirements
+    for tree generating parsers.
+
+Change 5551 on 2009/01/25 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-287
+    Most of the source files did not include the BSD license. This might
+    not be that big a deal given that I don't care what people do with it
+    other than take my name off it, but having the license reproduced
+    everywhere at least makes things perfectly clear. Hence this mass change of
+    sources and templates to include the license.
+
+Change 5549 on 2009/01/25 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-354
+    Using 0.0D as the default initialize value for a double caused
+    VS 2003 C compiler to bomb out. There seems to be no reason other
+    than force of habit to set this to 0.0D so I have dropped the D so
+    that older compilers do not complain.
+
+Change 5547 on 2009/01/25 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-282
+    All references are now unadorned with any type of NULL check for the
+    following reasons:
+
+    1) A NULL reference means that there is a problem with the
+       grammar and we need the program to fail immediately so
+       that the programmer can work out where the problem occurred;
+    2) Most of the time, the only sensible value that can be
+       returned is NULL or 0 which
+       obviates the NULL check in the first place;
+    3) If we replace a NULL reference with some value such as 0,
+       then the program may blithely continue but just do something
+       logically wrong, which will be very difficult for the
+       grammar programmer to detect and correct.
+
+Change 5545 on 2009/01/24 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-357
+    The bug report was correct in that the types of references to things
+    like $start were being incorrectly cast as they were not changed from
+    Java style casts (and the casts are unnecessary). This is now fixed
+    and references are referencing the correct, uncast, types.
+    However, the bug report was wrong in that the reference in the book to
+    $start.pos will only work for Java and really, it is incorrect in the
+    book because it should not access the .pos member directly but should
+    be using $start.getCharPositionInLine().
+    Because there is no access qualification in C, one could use
+    $start.charPosition, however
+    really this should be $start->getCharPositionInLine($start);
+
+Change 5541 on 2009/01/24 by jimi@jimi.jimi.antlr3
+
+    Fixed - ANTLR-367
+    The code generation for the free method of a recognizer was not
+    distinguishing tree parsers from parsers when it came to calling delegate free
+    functions.
+    This is now corrected.
+
+Change 5540 on 2009/01/24 by jimi@jimi.jimi.antlr3
+
+    Fixed ANTLR-355
+    Ensure that we do not attempt to free any memory that we did not
+    actually allocate because the parser rule was being executed in
+    backtracking mode.
+
+Change 5539 on 2009/01/24 by jimi@jimi.jimivista.antlr3
+
+    Fixed: ANTLR-355
+    When a C targeted parser is producing in backtracking mode, then the
+    creation of new stream rewrite structures should not happen if the rule is
+    currently backtracking
+
+Change 5502 on 2008/12/11 by jimi@jimi.jimi.antlr3
+
+    Fixed: ANTLR-349 Ensure that all marker labels in the lexer are 64 bit
+    compatible
+
+Change 5473 on 2008/12/01 by jimi@jimi.jimivista.antlr3
+
+    Fixed: ANTLR-350 - C runtime use of memcpy
+    Prior change to use memcpy instead of memmove in all cases missed the
+    fact that the string factory can be in a situation where overlaps occur. We now
+    have ANTLR3_MEMCPY and ANTLR3_MEMMOVE and use the two appropriately.
+
+Change 5387 on 2008/11/05 by parrt@parrt.spork
+
+    Fixed x+=. issue with tree grammars; added unit test
+
+Change 5325 on 2008/10/23 by parrt@parrt.spork
+
+    We were all ref'ing backtracking==0 hardcoded instead of checking the
+    @synpredgate action.
+