Reverse engineering Delphi and Turbo Pascal unit interfaces (and maybe DCP files too)
Posted by jpluimers on 2020/10/07
Boy, I wish there was both an Embarcadero-sanctioned grammar (see Delphi code completion fail with anonymous methods – Stack Overflow) and a DCU parser.
This might work for DCP files as well, since the PKX0 signature at the start of DCP files is in [WayBack] DCU32INT/DCP.pas at master · rfrezino/DCU32INT · GitHub.
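As a rough illustration of that signature check, here is a minimal Python sketch. It assumes the signature is exactly the four ASCII bytes PKX0 at offset 0, which is all the text above states; other Delphi versions may use a different marker. The function names are made up for this example and are not part of DCU32INT.

```python
import io

# Assumption: the signature is the four ASCII bytes "PKX0" at offset 0.
DCP_SIGNATURE = b"PKX0"


def looks_like_dcp(stream):
    """Return True if the binary stream starts with the PKX0 signature."""
    return stream.read(len(DCP_SIGNATURE)) == DCP_SIGNATURE


def file_looks_like_dcp(path):
    """Convenience wrapper (hypothetical helper) that checks a file on disk."""
    with open(path, "rb") as handle:
        return looks_like_dcp(handle)


if __name__ == "__main__":
    # In-memory demo, so the sketch runs without a real .dcp file.
    print(looks_like_dcp(io.BytesIO(b"PKX0\x00\x00")))
    print(looks_like_dcp(io.BytesIO(b"MZ\x90\x00")))
```

This only tells you whether a file is worth handing to a real parser; it says nothing about the contents after the signature.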
Being able to dump DCP files makes it way easier to create a matrix documenting all DCP files and units, so that their interdependencies and containments become clear (including any unit scopes).
Right now that is only documented from the unit to the package on the page of the unit (see for instance [WayBack] System.SysUtils – RAD Studio API Documentation), not the other way around. That makes it a pain to select which packages you need in your project when building with packages.
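The "other way around" lookup is just an inversion of the unit-to-package mapping. A minimal Python sketch, assuming you have already collected that mapping somehow (from the documentation pages or from a DCP dump); the function name and the sample data are illustrative only, and should be checked against the documentation.

```python
from collections import defaultdict


def units_by_package(unit_to_package):
    """Invert a unit -> package mapping into package -> sorted list of units."""
    grouped = defaultdict(list)
    for unit, package in unit_to_package.items():
        grouped[package].append(unit)
    # Sort packages and units so the resulting matrix is stable and diffable.
    return {package: sorted(units) for package, units in sorted(grouped.items())}


if __name__ == "__main__":
    # Illustrative sample; verify each entry against the unit's documentation page.
    sample = {
        "System.SysUtils": "rtl",
        "System": "rtl",
        "Vcl.Forms": "vcl",
    }
    for package, units in units_by_package(sample).items():
        print(package + ".dcp: " + ", ".join(units))
```

With a complete mapping, the same structure answers "which package do I need for unit X" and "what do I get when I add package Y".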
The list at [WayBack] Unit List – RAD Studio API Documentation (which actually is an “Alphabetical list of unit scopes, along with miscellaneous units that have no unit scope.”) is only partially helpful, especially as, for instance, the System unit page at [WayBack] System – RAD Studio API Documentation is 90% about the System unit scope, has the System unit itself about a third of the way down, and does not mention that it lives in the rtl.dcp package.
The list at [WayBack] Deciding Which Runtime Packages to Use – RAD Studio is even worse than the unit list, as it misses many useful packages (like dsnap).
For my link archive:
- [WayBack] Delphi XE: Reading basic information from a DCP-File | Sparetime-Development
- [WayBack] Delphi-PRAXiS – Einzelnen Beitrag anzeigen – Delphi XE’s DCP format: Auslesen einiger Basisinformationen
- [WayBack] On the Delphi *.dcp files | The Programming Works
- On package build order when doing DEBUG and RELEASE builds together
- [WayBack] How to get list of units in a Delphi Compiled Package (.dcp file) – Stack Overflow
- [WayBack] Units are compiled into DCU’s. If the sourcecode for a unit is not available, then the compiler uses the DCU instead… – Johan Bontes – Google+ Is there a way/tool to extract/reconstruct the interface part of a unit from the dcu?
- [WayBack] Compiler Design – Symbol Table
- [WayBack] GitHub – RomanYankovsky/DelphiAST: Abstract syntax tree builder for Delphi
- [WayBack] GitHub – rfrezino/RFindUnit: Replace the Delphi FindUnit uses DCU32INT
- [WayBack] DCU32INT: Delphi Compiled Units Parser
- Based on FlexT:
- [WayBack] Borland Delphi (vv. 2.0-9.0), Kylix (vv. 1.0-3.0) Unit
- [WayBack] FlexT / [WayBack] FlexT main page
- [WayBack] Catalog of FlexT format specifications for reverse engineering binary file formats
- [WayBack] Data description language FlexT: flexible types for description of static data.
- [WayBack] DCU32INT FAQ.
- [WayBack] Linux executable in TGZ
- [WayBack] Windows executable in ZIP
- [WayBack] DCU32INT sources in ZIP
- [WayBack] DCU32INT.txt readme
- Way old, but historically relevant: DCU2PAS, DCP2DPK and the ancient TPUINFO (by J.P.Ritchie), via:
Johan wanted to create a compiler symbol table from the binary DCU files (unlike DelphiAST which does it from the Pascal source files).
From the pre-Delphi era, I found back some info from my own archive:
In the Turbo Pascal days, you had TW1UNA and TPUUNA by William L. Peavy, which I think led to INTRFC from Duncan Murdoch (or maybe vice versa), which in turn got updated to the Turbo/Borland Pascal 7 format by Milan Dadok (see [Wayback/Archive] http://sources.ru/pascal/hacker/intrfc70.htm). Since the basic format of DCU files is very similar to that, my guess is that DCU32INT built on that.
Later I found [Wayback/Archive] The Programmer’s Corner » TPU60C.ZIP » Pascal Source Code, also by William L. Peavy, and [Wayback/Archive] Duncan Murdoch’s Programs.
Edit 20220621:
- moved the www8.pair.com links to murdoch-sutherland.com
- added more Wayback and Archive links
–jeroen