Thursday, March 1, 2012

Perforce Python API Basics

Many Python users working with Perforce believe that calling out to "p4.exe" with subprocess is the only option. In fact, Perforce maintains free, native API packages for several languages, including Python. The Perforce Python API is fast, fully featured and easy to work with. It lets you interact with Perforce in a familiar Python manner, without having to capture and parse command-line output. Parsing output is one of my least favorite things to do, and I doubt I'm alone there.

Here is a dead-simple example, showing how to use the Perforce Python API to sync all files in a certain depot folder.

# Sync contents of a folder
import P4

p4_api = P4.P4( )
p4_api.connect( )

results = p4_api.run_sync( '//project_x/...' )

p4_api.disconnect( )

Some notes on connections: you'll notice above that I first "connected" before issuing any commands with the API. Typically you do this once in your tool or script, run whatever Perforce commands you need, then disconnect when you're finished or the tool closes. The connection is also closed when the API object falls out of scope and is destroyed. There's no need to open and close the connection around every command.
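
As a minimal sketch of that "connect once, run several commands, disconnect" pattern, the function below syncs a list of depot paths over a single connection. The name "sync_folders" and its optional "p4_api" parameter are my own inventions for illustration. The parameter exists only so the function can be tried with a stand-in object when no Perforce server is handy.

```python
def sync_folders(depot_paths, p4_api=None):
    """Connect once, sync each depot path, and always disconnect."""
    if p4_api is None:
        # Imported here so the function can be exercised with a stand-in
        # object on machines that do not have the Perforce API installed.
        import P4
        p4_api = P4.P4()

    p4_api.connect()
    try:
        # One connection serves every command issued in this block.
        return {path: p4_api.run_sync(path) for path in depot_paths}
    finally:
        # Runs even if a command raises, so the connection is not left open.
        p4_api.disconnect()
```

The try/finally is a design choice rather than a requirement. It just guarantees the disconnect happens if one of the syncs fails partway through.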

You can also use the "with" statement to easily manage the connection, automatically disconnecting when that block of code is completed:
with p4_api.connect( ):
   # connected here
   results = p4_api.run_sync( '//project_x/...' )
# disconnected here

Going back to the top example: as written, it simply uses the default Perforce port, client and user. To set one of these explicitly instead of relying on the default, call the "set_env" function before your connect call (new in version 2011.1):
p4_api.set_env( 'P4CLIENT', 'my_workspace' )
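
Another way to override the defaults for a single connection is to assign the API object's "port", "user" and "client" attributes before connecting. The helper below is a small sketch of that idea. "apply_settings" is a hypothetical name, and the values in the usage comment are placeholders, not real server details.

```python
def apply_settings(p4_api, port=None, user=None, client=None):
    """Set connection attributes on p4_api, skipping any left as None."""
    for name, value in (('port', port), ('user', user), ('client', client)):
        if value is not None:
            # Only override what the caller asked for; defaults stay as-is.
            setattr(p4_api, name, value)
    return p4_api

# Typical use, before calling connect( ) (placeholder values):
#   p4_api = apply_settings( P4.P4( ), client='my_workspace' )
```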

Next, take a look at the "run_sync" command we issued. One cool thing about the Perforce Python API is that the general syntax for everything is "<api_object>.run_<command>( args )", where "command" is literally the command string you would pass to p4.exe on the command line. Examples: "run_sync", "run_edit", "run_add", "run_fstat", etc. If you know how to use Perforce from the command line, you already know how to use the Python API.
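
Because the method names follow that pattern, a command chosen at runtime can be looked up by name. The sketch below shows the idea with plain getattr. "run_command" is a hypothetical helper of mine, not part of the API. The API object also has a generic "run" method that takes the command as its first argument, which should do the same job.

```python
def run_command(p4_api, command, *args):
    """Call p4_api.run_<command>(*args), e.g. 'sync' -> run_sync."""
    # Build the method name from the command-line command string.
    method = getattr(p4_api, 'run_' + command)
    return method(*args)

# Typical use on a connected API object:
#   results = run_command( p4_api, 'fstat', '//project_x/...' )
```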

Above you'll see I captured the return value of our sync as "results". Calls like this return a single list of dictionaries, one dict for each file the operation was run on. In the sync example above, only two files in my workspace needed updating, so the returned results object looks like this:
[
   {
      'totalFileSize': '5299712',
      'rev': '319',
      'totalFileCount': '2',
      'clientFile': 'D:\\projects\\project_x\\stuff.dll',
      'fileSize': '4865024',
      'action': 'updated',
      'depotFile': '//project_x/stuff.dll',
      'change': '969310'
   },

   {
      'action': 'updated',
      'clientFile': 'D:\\projects\\project_x\\foo.txt',
      'rev': '134',
      'depotFile': '//project_x/foo.txt',
      'fileSize': '434688'
   }
]

Looking at the second dictionary at the bottom, you'll notice several keys indicating data from the sync operation for that file, including the action ("updated", "added", etc.), both the client and depot paths to the file, and its new revision number.

For some operations the first dictionary returned contains some extra keys related to the overall operation, such as the total number of files acted on, and their total sizes on disk.
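
Here is a small sketch of pulling those overall totals out of a results list. It assumes the keys are named as in the sync output shown earlier ("totalFileCount" and "totalFileSize") and that the values arrive as strings, as they do there. "operation_totals" is a name I made up for the example.

```python
def operation_totals(results):
    """Return (file_count, total_bytes) from the first result, or None."""
    if not results or not isinstance(results[0], dict):
        return None
    first = results[0]
    if 'totalFileCount' not in first or 'totalFileSize' not in first:
        # Not every operation reports totals, so treat them as optional.
        return None
    # The values in the sample output are strings; convert before doing math.
    return int(first['totalFileCount']), int(first['totalFileSize'])
```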

These returned results contain the data you need to present friendly messages to your users. Being in simple dictionary form means they're flexible and easy to work with.
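
To give a rough idea of what that can look like, the sketch below turns a results list into one readable line per file. The function name and the message format are arbitrary choices of mine. The .get( ) fallbacks are there because, as the sample output shows, not every dictionary carries every key.

```python
def describe_results(results):
    """Build one human-readable line per file dictionary in results."""
    lines = []
    for entry in results:
        if not isinstance(entry, dict):
            # Assumption: any non-dict entry is a plain message; pass it through.
            lines.append(str(entry))
            continue
        lines.append('{action}: {depot_file} (rev {rev})'.format(
            action=entry.get('action', 'unknown'),
            depot_file=entry.get('depotFile', '?'),
            rev=entry.get('rev', '?')))
    return lines

# Typical use:
#   for line in describe_results( results ):
#       print( line )
```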