Monday, September 28, 2009

Virtual Render Farms


Last week, I was invited to participate in a VES panel discussion about "Virtual Render Farms", followed by a kick-ass BBQ, all in the old ILM facility on Kerner Blvd in San Rafael. While I'm sure the free food and beer had something to do with the turnout, there was a good-sized audience for the panel discussion itself.

Steve Mays, from X2 Technologies, led off the discussion with a very in-depth analysis of the various technologies involved - including Teradici thin clients that could (ultimately) allow 2D and 3D artists to work remotely - and I mean internationally-remotely, with all of the data stored within the datacenter. (That's a topic for another post: what would the fully-integrated remote production datacenter look like? But I digress...)

Alex Ethier (shown below, with his dog, who also attended) gave a succinct analysis of the "production" perspective - which was generally that this would be a Good Thing all round.

But what does it mean, this idea of a "virtual" render farm? Well, there are several layers to that question. At the simplest level, running multiple OSes, each in a separate Virtual Machine (VM), allows studios to combine multiple functions on a single (multi-core) server, in isolated memory spaces. It's also possible to run VMs as an alternative to multi-booting a machine when different applications (e.g. Adobe After Effects, which runs on Mac and Windows, but not Linux) need to be distributed on the render farm. Yes, there's an overhead associated with that, but that overhead gets smaller each day - and with I/O virtualization, each subsequent bottleneck becomes less and less of an issue.

There are other benefits, as well - imagine multi-core, multi-head servers in a data center, with artists using thin clients on their desktops. Even if those artists are in the same facility as the data center, the cost-per-artist drops, along with up-front CapEx. With sufficient bandwidth available, some of those artists (or most, or all) could be working from home - or from remote locations, wherever they happened to live. The next logical step is a central data center that could support multiple studios - further reducing the CapEx budgets (and thus, barriers to entry) for smaller teams of artists and programmers to expand to accommodate larger productions.
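To put some (entirely hypothetical) numbers on the cost-per-artist claim, here's a back-of-the-envelope sketch. None of these figures are real quotes; the point is just the shape of the comparison between standalone workstations and shared servers plus thin clients:

```python
# Back-of-the-envelope: per-artist CapEx for standalone workstations vs.
# shared multi-core servers plus thin clients. Every number below is a
# made-up placeholder -- plug in your own quotes.

def per_artist_capex(workstation_cost, server_cost, artists_per_server,
                     thin_client_cost):
    """Return (standalone, shared) up-front cost per artist."""
    standalone = workstation_cost
    shared = server_cost / artists_per_server + thin_client_cost
    return standalone, shared

standalone, shared = per_artist_capex(
    workstation_cost=4000,   # hypothetical loaded workstation
    server_cost=12000,       # hypothetical multi-core, multi-head server
    artists_per_server=8,    # artists sharing one server via VMs
    thin_client_cost=500,    # hypothetical thin client + display
)
print(f"standalone: ${standalone:,.0f}/artist   shared: ${shared:,.0f}/artist")
```

With those placeholders, the shared model works out to about $2,000 per artist versus $4,000 standalone - and that's before counting power, cooling, or the option of soaking up idle cores with render jobs overnight.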

Now, there are any number of technical obstacles that still need to be overcome for this to become a reality - and some of that includes the pricing on some of these technologies. But as frequent readers of this blog will have already guessed, my focus - and the theme I touched on during the panel discussion - is on the social engineering aspects...

The biggest obstacle is this: almost none of the software packages used in Digital Media (e.g. Maya, Mental Ray, Renderman, etc.) have licensing agreements that permit this kind of use. I'm not talking about metered software, either - where a central facility (or the publisher itself) would rent time on a license. I'm talking about a corporate entity that owns licenses to that software, but can't run them on machines that are more than [n] miles from the address noted on the license (typically 5 miles or so). "Rendering on the cloud" is not the point - without legitimate software licenses for the rendering applications, and a pipeline/workflow that can support the data transmission costs and issues, it's not a real solution.

Speaking as the head of a software company, I understand the problem here. All of our business models are built on licensing software to end-users, not metering it. But at PipelineFX, we've already embraced daily rentals of our software - and this has turned out to be extremely popular. I think this is, to some degree, an inevitable aspect of the software business - it will not be without some pain, as publishers struggle to create a business model that works for both sides, and the adoption of this model in general depends on the availability and affordability of the technology discussed earlier. But I also think it will open up new business models on the content-creation side of things - which can only be a good thing for artists and TDs and producers, as well as software publishers.

Wednesday, July 9, 2008

Random Musing on iTunes and music databases

Apropos of nothing, I've been reorganizing my iTunes collection - I work from home, and I have all my CDs ripped (Apple Lossless format) to a D-Link DNS-323 NAS (mirrored 500GB drives). The 323 has a built-in iTunes server, and I connect to it with my Roku SoundBridge - it's great. Have you ever put your entire CD collection on "shuffle"? You get some very odd juxtapositions between songs, if your collection is anything like mine.

Anyway - having spent a considerable amount of time fixing metadata on all these songs, I have some gentle words of advice for the kind souls who enter metadata on Gracenote or freedb...

1) Not everything is a compilation. Just because there's a guest artist on the album (e.g. Gwen Stefani sings along on one song on Moby's "Play") doesn't mean it's a compilation. My modest proposal: a compilation is an album compiled from various artists that has no other logical location in which to reside. "Play" is still a Moby album, right? You'd still file the physical disc under "Moby". This also applies to "Greatest Hits" albums - even 54-40's "Sweeter Things - A Compilation" is not really a compilation in the sense that iTunes uses. And McCartney's "Band on the Run" just isn't a compilation in any sense of the word - but that's where iTunes put it, so somebody had to have thought that was appropriate.

2) Spelling counts. "Bruce Springstein"? Really?

3) A place for everything and everything in its place. Don't put the name of the artist in the Album title. Name your physical files however you like, but calling an album "U2 - All that you can't leave behind" just creates a new album, right next to the one called "All that you can't leave behind".

4) Guest artists should be listed per song. I know I'm out of step with a lot of people on this one - including MuchMusic - but when you label Peter Gabriel's "Shaking the Tree" as being "Peter Gabriel feat. Youssou N'Dour", you create an entirely separate directory for ONE SONG. If it's an album of duets, you'll create 10 separate directories - one for each song. My modest proposal: label the song. "Shaking the tree feat. Youssou N'Dour" is just as descriptive, and leaves the song filed under Peter Gabriel (where it belongs).

5) Multiple CD releases are still just one album. It's The Beatles' "White Album" - it's not "White Album (CD1)" and "White Album (CD2)". (This actually raises some interesting issues about what constitutes "the album": the content, or the delivery mechanism? Or a combination of the two? Does the album experience change when you don't have to flip to side 2?) In any case, the disc number is not part of the title, any more than the side of the vinyl LP was.

What does any of this have to do with pipelines? A lot, actually. When you have a system that's in use by a lot of people, that organizes data in such a way that people need to be able to find it, and you leave the organization up to the aggregate efforts of those people, you need very clear, well-thought-out rules for organizing that data, or stuff will wind up all over the place.
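To make "very clear, well-thought-out rules" concrete, here's a rough sketch of what a tag normalizer enforcing a few of the rules above might look like. The field names and logic are my own invention for the example, not any real tagging library's API:

```python
import re

# A rough sketch of the tagging rules above, applied to one track's metadata.
# The dict keys and logic are illustrative only -- adapt to your own tagger.

def normalize_track(tag):
    """Normalize a dict with 'artist', 'album', and 'title' keys."""
    artist, album, title = tag["artist"], tag["album"], tag["title"]

    # Rule 4: move "feat. X" from the artist field into the title,
    # so the track stays filed under the primary artist.
    m = re.match(r"^(?P<main>.+?)\s+feat\.\s+(?P<guest>.+)$", artist, re.I)
    if m:
        artist = m.group("main").strip()
        title = f"{title} feat. {m.group('guest').strip()}"

    # Rule 3: strip a leading "Artist - " prefix from the album title.
    prefix = f"{artist} - "
    if album.lower().startswith(prefix.lower()):
        album = album[len(prefix):]

    # Rule 5: fold "(CD1)" / "(Disc 2)" suffixes into a disc-number field.
    m = re.search(r"\s*[\(\[](?:cd|disc)\s*(\d+)[\)\]]\s*$", album, re.I)
    if m:
        tag["disc"] = int(m.group(1))
        album = album[:m.start()].rstrip()

    # Rule 1: it's only a compilation if it's genuinely by various artists.
    tag["compilation"] = artist.strip().lower() == "various artists"

    tag.update(artist=artist, album=album, title=title)
    return tag
```

Feed it {"artist": "Peter Gabriel feat. Youssou N'Dour", "album": "Shaking the Tree", "title": "Shaking the Tree"} and the guest credit moves into the song title, which is exactly where rule 4 says it belongs.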

Saturday, December 1, 2007

Possibility Spaces

Well, someone pointed out that I haven't updated this in a while (Hi, Biju!), and I did have a thought...

I was stuck in a middle seat, on a flight from Montreal to Vancouver yesterday, so I wound up playing a bunch of Solitaire on my Palm LifeDrive. I never really cared for the game - I thought it was like crossword puzzles, and entirely mechanical/deterministic. But it's not, and there are a bunch of different choices you make as the game goes on that can determine whether you win or lose (an "undo" function on my game let me back out of several losing games and follow a different path to win).

Anyway, it got me thinking about the different branches you can take in the game (possibility spaces), and ways to weight the winning branches and create a winning algorithm - standard fare for genetic algorithms.

That got me thinking about the possibility spaces for queuing algorithms - trying to maximize the throughput of a system using a heuristic based on recurring types of metadata in the job submissions. You can't "undo" queuing decisions, but you can change them over time based on historical results, and optimize them. Not something super-relevant for a production queuing system, since there are usually far more production/political issues that need to be weighed than strict efficiency concerns.
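For what it's worth, the "learn from history, no undo" idea can be sketched as something as simple as an exponential moving average of past runtimes feeding a priority weight. This is only an illustration of the concept, not code from any real queuing system:

```python
# Sketch: nudge per-job-type priority weights from historical results.
# No "undo" -- past decisions only feed back as a running average.

from collections import defaultdict

class HistoricalWeights:
    """Exponential moving average of observed runtimes per job type."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha                       # how quickly history decays
        self.avg_runtime = defaultdict(lambda: None)

    def record(self, job_type, runtime):
        prev = self.avg_runtime[job_type]
        self.avg_runtime[job_type] = (
            runtime if prev is None
            else (1 - self.alpha) * prev + self.alpha * runtime
        )

    def priority(self, job_type):
        # Favor job types that historically finish quickly -- a soft
        # "shortest job first" driven by metadata instead of guesses.
        avg = self.avg_runtime[job_type]
        return 1.0 if avg is None else 1.0 / (1.0 + avg)

w = HistoricalWeights()
w.record("comp", 120)        # hypothetical 2-minute comps
w.record("fluid_sim", 5400)  # hypothetical 90-minute sims
assert w.priority("comp") > w.priority("fluid_sim")  # comps get picked first
```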

But that got me thinking about possibility spaces for business deals (and, really, personal interactions of all sorts) - where there is no undo, and the historical conditions from case to case are so different that it's hard to create a heuristic beyond "work with people you trust, and get it agreed to in writing - for everybody's sake".

In retrospect, I suppose that's only really interesting if you're currently weighing different business choices - but at least it's a new blog entry! More to follow.

Friday, May 11, 2007

How Would Google Render?

Here's another interesting post from Tim O'Reilly: "What Would Google Do?" Render farm management isn't a "Live" application (yet), but there are still plenty of lessons to be taken from Google, or Amazon. (Amazoogle!)

In my last post, I held Amazon up as the only real contender for an online compute/render service; Google would be the other obvious choice, though it doesn't offer a general computing service (yet).

But it's the history-tracking possibilities that interest me, more than the "Live" aspect of a web service. We already use a MySQL database behind Qube! - which lets you track the history of everything that goes through the farm. This kind of history is extremely useful for figuring out future bids and purchases: what was our peak/average utilization? How many iterations of each shot are we doing? What's our average render time? In short - where are we spending our time and money, and where are the bottlenecks?
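To give a flavor of the history queries I mean, here's a minimal sketch against a hypothetical job-history table. The schema is made up for the example (it is not Qube!'s actual schema), and I'm using sqlite3 here only to keep the sketch self-contained:

```python
import sqlite3

# Hypothetical job-history schema, for illustration only (not Qube!'s schema):
# jobs(id, shot, started_at, finished_at) with timestamps as epoch seconds.

conn = sqlite3.connect("farm_history.db")

avg_seconds = conn.execute(
    "SELECT AVG(finished_at - started_at) FROM jobs"
).fetchone()[0] or 0

iterations = conn.execute(
    "SELECT shot, COUNT(*) FROM jobs GROUP BY shot ORDER BY COUNT(*) DESC"
).fetchall()

print(f"average render time: {avg_seconds:.0f} seconds")
for shot, count in iterations[:10]:
    print(f"{shot}: {count} iterations")
```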

And there's still more room for improvement. Optimal scheduling is a difficult problem that can be made easier when you know more about the items in the queue: in particular, how long will they take to run? (Shortest job first is an easy/obvious optimization, but only if you know which jobs are the shortest.) But digital media is extremely iterative - even though the parameters of the shot will change between iterations, there's a lot of predictive information that can be passed from job to job. Texture references can be cached on the local disks of the render nodes - and their presence can be used as a weighting factor to favor jobs that can use them. "Hotspots" on central storage can be "cooled" by favoring jobs that don't draw from those areas - lots of possibilities.
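A simplified version of that weighting might look like the sketch below. The field names (predicted_runtime, texture_refs, cached_textures, hot_volumes) are hypothetical stand-ins for whatever metadata a real queuing system carries; the point is just how predicted runtime and cache affinity could fold into a single score:

```python
# Sketch of a scoring function for picking the next job on a render node.
# Field names are hypothetical; the weights would be tuned per site.

def score(job, node, w_short=1.0, w_cache=0.5, w_hotspot=0.3):
    """Higher score = run sooner on this node."""
    # Soft shortest-job-first: use the predicted runtime carried over
    # from earlier iterations of the same shot.
    shortness = 1.0 / (1.0 + job["predicted_runtime"])

    # Cache affinity: fraction of the job's texture references already
    # sitting on this node's local disk.
    refs = job["texture_refs"]
    cache_hits = len(refs & node["cached_textures"]) / max(len(refs), 1)

    # Penalize jobs that read from storage volumes currently running hot.
    hotspot_penalty = 1.0 if job["storage_volume"] in node["hot_volumes"] else 0.0

    return w_short * shortness + w_cache * cache_hits - w_hotspot * hotspot_penalty

def pick_next(jobs, node):
    """Choose the queued job with the best score for this node."""
    return max(jobs, key=lambda j: score(j, node))
```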

The data is relatively easy to collect, and not overly difficult to incorporate into the decision-making process; the real trick is building an interface that's simple enough to be generally useful. Queuing systems need to be easier to use, and easier to set up - and advanced functionality can't come at the expense of that usability...

Monday, April 30, 2007

A failure to communicate...

I look at a lot of grid/cluster/render farm services. I haven't seen a lot of big success stories on the media and entertainment front, for a number of reasons:

- Bandwidth: render jobs have relatively large input/output files, and the cost of moving the files back and forth can be relatively high, compared to the cost of computation.
- Security: most studios are very concerned about leaks of imagery or data; interestingly, I've heard of large studios that farm out work to smaller boutique studios - who then farm their renders out to render services...
- Cost: Visual FX companies tend to be much more price-sensitive than other industries (why that's so will have to be the topic of another post); "grid" services that charge $1/CPU-hour (bought in bulk, in advance) are prohibitively expensive.
- Lack of persistent storage: It's not efficient to upload separate copies of shared source files (referenced geometry and textures) for every render. Shared persistent storage is the optimal solution, but it's not available with all grid services - and when it is, there's an additional cost.
- Application licenses, and proprietary code: the cost of 3rd-party apps (e.g. Maya, Mental Ray, Renderman) frequently exceeds the cost of the hardware they run on, the EULAs for these apps usually prohibit any kind of rental or service usage, and there's no persistent connection (e.g. VPN) to share local licenses. Also, advanced render pipelines almost always include proprietary, site-specific code, which presents a similar problem.

Amazon's EC2 service may be the exception - the issue of application licenses remains, but Amazon's S3 storage service solves most of the storage issues. If Amazon could provide some kind of persistent VPN connection, we'd be set. Their costs (currently 10 cents per CPU-hour for computation, and 15 cents per GB/month of storage) are much more competitive; the transfer costs for S3 (10 cents per GB uploaded, 13-18 cents per GB downloaded) are more problematic, but at least it's getting closer.
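A quick back-of-the-envelope at those published rates shows why the bandwidth line item matters as much as the CPU-hour price. The frame count, render time, and file sizes below are made-up examples, not measurements:

```python
# Back-of-the-envelope: compute cost vs. transfer cost at the rates quoted
# above ($0.10/CPU-hour, ~$0.10/GB uploaded, ~$0.15/GB downloaded).
# Frame counts, render times, and file sizes are hypothetical.

frames          = 200     # one shot
render_hours    = 1.5     # CPU-hours per frame (hypothetical)
input_gb        = 5.0     # scene + textures uploaded once (hypothetical)
output_gb_frame = 0.05    # ~50 MB per rendered frame (hypothetical)

compute  = frames * render_hours * 0.10
transfer = input_gb * 0.10 + frames * output_gb_frame * 0.15

print(f"compute:  ${compute:.2f}")   # $30.00 with these numbers
print(f"transfer: ${transfer:.2f}")  # $2.00 with these numbers
```

With these numbers the compute bill dwarfs the transfer bill - but shrink the per-frame render time to a few minutes and fatten the output to multi-layer EXRs, and the transfer cost quickly becomes the bigger line item.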

A bigger question is, what does this mean for the business models for storage, servers, and application software? In the short term, nothing. In the longer term, I think this is a business model that will have to be addressed...

Sunday, April 29, 2007

Synchronicity III

I was all set to write about the NAB trip (and subsequent LA visit) and the synchronicity of trade shows and business travel - sometimes things just seem to come together.

But this morning I was sitting out on the patio, keeping an eye on my daughter playing in her new sandbox - just enjoying the sun, and catching up on some reading. And just having an hour or so of "down time" gave me more clarity, more ideas, and more opportunity to make 'connections' - in my head, and with what I'd been reading - than a week of business travel, trade shows, and meetings.

Friday, April 13, 2007

Design Failure

So, I'm headed back to Vegas for NAB, and I got asked to change my ticket from Sunday to Saturday, to get there a day early for some meetings. Unfortunately, I bought my ticket through Cheap Tickets, and while I can change that flight, I would also have to change my flight the next day, because it's sold out...

"But I'm already booked on that flight..."
- "Yes sir, but I have to issue an entirely new itinerary, and I can't do that because your second flight is sold out."
"But I don't want to change my second flight - I'm travelling with someone on that flight..."
- "Then you can't change your first flight."

I talked to the airline, and they couldn't do anything - it had to be changed by Cheap Tickets. So I said okay, I'll book another flight, on a different airline, and just eat the other flight. Did that, and called Cheap Tickets to cancel the first leg:

- "I'm sorry sir, this is a round-trip flight, and you can't cancel the first leg."
"Okay, well, I just won't be taking it - I'm taking another flight down the day before - but then I'll complete the rest of my ticket, as planned.
- "No sir, if you aren't on your first flight, we automatically cancel the rest of your flights."

How is it possible to do business like this? How can somebody have an interface so crippled, so useless, that it's not possible to make a simple change like this? It can't be that they don't want to be able to do it - they were going to charge $150 US in change fees.

Never assume malice when stupidity will suffice.