This repository has been archived by the owner on Apr 27, 2023. It is now read-only.

Third PR about formatting and wording... #31

Open
wants to merge 5 commits into
base: master
63 changes: 32 additions & 31 deletions README.md
@@ -1,56 +1,57 @@
# microsoft-pdb
This repo contains information from Microsoft about the PDB (Program Database)

This repo contains information from Microsoft about the PDB (Program Database)
[Symbol File](https://msdn.microsoft.com/en-us/library/windows/desktop/aa363368(v=vs.85).aspx) format.

[WILL NOT currently build. There is a cvdump.exe till the repo is completed. pdb.h is in the langapi folder]
_THIS WILL NOT currently build. There is a `cvdump.exe` till the repo is completed. [`pdb.h`](https://github.com/rdeforest/microsoft-pdb/blob/master/langapi/include/pdb.h) is in the `langapi` folder._

The intent here is to provide code that will show all the binary level formats and simple tools that can use the pdb.

Simply put ...We will make best efforts to role this foward with the new compilers and tools that we ship every release. We will continue to innovate and change binary API's and ABI's for all the Microsoft platforms and we will try to include the community by keeping this PDB repo in synch with the latest retail products (compilers,linkers,debuggers) just shipped.
Simply put, we will make best efforts to roll this forward with the new compilers and tools that we ship every release. We will continue to innovate and change binary APIs and ABIs for all the Microsoft platforms, and we will try to include the community by keeping this PDB repo in sync with the latest retail products (compilers, linkers, debuggers) just shipped.

By publishing this source code, we are bypassing the publicly documented API we provided for only reading a PDB - that was DIA
https://msdn.microsoft.com/en-us/library/x93ctkx8.aspx
https://msdn.microsoft.com/en-us/library/x93ctkx8.aspx

With this information we are now building the information for other compilers (and tools) to efficiently write a PDB.
With this information we are now building the information for other compilers (and tools)
to efficiently write a PDB.

The PDB format has not been officially documented, presenting a challenge for other compilers and
toolsets (such as Clang/LLVM) that want to work with Windows or the Visual Studio debugger. We want
to help the Open Source compilers to get onto the Windows platform.
The majority of content on this repo is presented as actual source files from the VC++ compiler
toolset. Source code is the ultimate documentation :-) We hope that you will find it helpful. If you

The majority of content on this repo is presented as actual source files from the VC++ compiler
toolset. Source code is the ultimate documentation :-) We hope that you will find it helpful. If you
find that you need other information to successfully complete your project, please enter an
[Issue](https://github.com/microsoft/microsoft-pdb/issues) letting us know what information you need.

##Start here
The file pdb.h (on in langapi), provides the API surface for mscorpdb.dll, which we ship with every compiler and toolset.
## Start here
The file [`pdb.h`](https://github.com/rdeforest/microsoft-pdb/blob/master/langapi/include/pdb.h) provides the API surface for `mscorpdb.dll`, which we ship with every compiler and toolset.

Important points:

• Mscorpdb.dll is what our linker and compiler uses to create PDB files.
• Mscorpdb.dll implements the “stream” abstractions.
- `mscorpdb.dll` is what our linker and compiler use to create PDB files.
- `mscorpdb.dll` implements the “stream” abstractions.

Also there is another file that we ship that should allow you to determine whether you have correctly produced an “empty” PDB which contains the minimal encoding to let another tool open and correctly parse that “empty” file. “Empty” really meaning a properly formated file where the sections contain the correct information to indicate zero records or symbols are present
A tool that I thought we also ship that would easily verify your “empty” PDB file is dia2dump.exe
Also there is another file that we ship that should allow you to determine whether you have correctly produced an “empty” PDB which contains the minimal encoding to let another tool open and correctly parse that “empty” file. “Empty” really meaning a properly
formatted file where the sections contain the correct information to indicate zero records or symbols are present.
A tool that I thought we also ship that would easily verify your “empty” PDB file is `dia2dump.exe`.

So in summary, by using the externally defined function entry points in pdb.h you can call into mscorpdb.dll.
So in summary, by using the externally defined function entry points in `pdb.h` you can call into `mscorpdb.dll`.

##What is a PDB
## What is a PDB

PDBs are files with multiple ‘streams’ of information in them. You can almost assume each stream as an individual file, except that storing them as individual files is wasteful and inconvenient, hence this multiple streams approach. PDB streams are not NTFS streams though. They can be implemented as NTFS streams, but since they are to be made available on Win9X as well, they use a home brewed implementation. The implementation allows a primitive form of two-phase commit protocol. The writers of PDB files write what ever they want to in PDBs, but it won’t be committed until an explicit commit is issued. This allows the clients quite a bit of flexibility - say for example, a compiler can keep on writing information, and just not commit it, if it encounters an error in users’ source code.
PDBs are files with multiple ‘streams’ of information in them. You can almost treat each stream as an individual file, except that storing them as individual files is wasteful and inconvenient [citation needed], hence this multiple streams approach. PDB streams are not NTFS streams though. They can be implemented as NTFS streams, but since they are to be made available on Win9X as well, they use a home brewed implementation. The implementation allows a primitive form of two-phase commit protocol. The writers of PDB files write what ever they want to in PDBs, but it won’t be committed until an explicit commit is issued. This allows the clients quite a bit of flexibility. For example: a compiler can write information to a stream and not commit it until it has confirmed success.
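
The write-then-commit behaviour described above can be illustrated with a small sketch. This is not the real PDB implementation (that is on-disk and lives in the C++ sources of this repo). It is an in-memory toy, and every name in it (`ToyStreamFile`, `append`, `commit`, `rollback`, `compile_unit`) is hypothetical. It only shows the idea that staged stream data stays invisible to readers until an explicit commit is issued.

```python
class ToyStreamFile:
    """In-memory stand-in for a multi-stream container with explicit commit.

    Hypothetical illustration only; this does not model the real PDB layout.
    """

    def __init__(self):
        self._committed = {}  # stream number -> bytes visible to readers
        self._pending = {}    # stream number -> bytes staged by a writer

    def append(self, stream_no, data):
        """Stage data for a stream; readers do not see it until commit()."""
        base = self._pending.get(stream_no, self._committed.get(stream_no, b""))
        self._pending[stream_no] = base + data

    def read(self, stream_no):
        """Return only the committed bytes of a stream."""
        return self._committed.get(stream_no, b"")

    def commit(self):
        """Publish every staged stream in one step."""
        self._committed.update(self._pending)
        self._pending.clear()

    def rollback(self):
        """Discard everything staged since the last commit."""
        self._pending.clear()


def compile_unit(pdb, stream_no, records, ok):
    """Stage records, then commit only if the (pretend) compile succeeded."""
    for record in records:
        pdb.append(stream_no, record)
    if ok:
        pdb.commit()
    else:
        pdb.rollback()
    return pdb.read(stream_no)
```

A writer that hits an error calls `rollback()` (or simply never commits), so the committed contents stay exactly as they were before the write started.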

Each stream is identified with a unique stream number and an optional name. In a nutshell here’s how the PDB looks like -

| Stream No. | Contents |Short Description
|--------------|---------------------------------|-------------------
| 1 | Pdb (header) | Version information, and information to connect this PDB to the EXE
| 2 | Tpi (Type manager) | All the types used in the executable.
| 3 | Dbi (Debug information) | Holds section contributions, and list of ‘Mods’
| 4 | NameMap | Holds a hashed string table
| 4-(n+4) | n Mod’s (Module information) | Each Mod stream holds symbols and line numbers for one compiland
| n+4 | Global symbol hash | An index that allows searching in global symbols by name
| n+5 | Public symbol hash | An index that allows searching in public symbols by addresses
| n+6 | Symbol records | Actual symbol records of global and public symbols
| n+7 | Type hash | Hash used by the TPI stream.
Each stream is identified with a unique stream number and an optional name. In a nutshell here’s what the PDB looks like:

| Stream No. | Contents | Short Description
|-------------|---------------------------|--------------------
| 0 | Pdb (header) | Version information, and information to connect this PDB to the EXE
| 1 | Tpi (Type manager) | All the types used in the executable.
| 2 | Dbi (Debug information) | Holds section contributions, and list of ‘Mods’
| 3 | NameMap | Holds a hashed string table
| n+4 | Mod n Module information | Each Mod stream holds symbols and line numbers for one compiland
| -4 | Global symbol hash | An index that allows searching in global symbols by name
| -3 | Public symbol hash | An index that allows searching in public symbols by addresses
| -2 | Symbol records | Actual symbol records of global and public symbols
| -1 | Type hash | Hash used by the TPI stream.