<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Authoritative Opinion &#187; TODO</title>
	<atom:link href="http://authoritativeopinion.com/blog/category/todo/feed/" rel="self" type="application/rss+xml" />
	<link>http://authoritativeopinion.com/blog</link>
	<description></description>
	<lastBuildDate>Mon, 19 Jul 2010 00:04:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1-alpha</generator>
		<item>
		<title>Digital Asset Management for Public Broadcasting: Interlude</title>
		<link>http://authoritativeopinion.com/blog/2010/05/28/digital-asset-management-for-public-broadcasting-blacklight-interlud/</link>
		<comments>http://authoritativeopinion.com/blog/2010/05/28/digital-asset-management-for-public-broadcasting-blacklight-interlud/#comments</comments>
		<pubDate>Fri, 28 May 2010 18:10:12 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://authoritativeopinion.com/blog/2010/05/28/digital-asset-management-for-public-broadcasting-blacklight-interlud/">chris</span></dc:creator>
				<category><![CDATA[Repository]]></category>
		<category><![CDATA[TODO]]></category>

		<guid isPermaLink="false">http://authoritativeopinion.com/blog/?p=344</guid>
		<description><![CDATA[Just a quick update on my progress developing a shareable prototype. The basic integration work is functional; I&#8217;ve ripped out the previously-mentioned Camel workflow components in favor of ruote (which is so much easier to wrap my mind around; I&#8217;ve pushed the skeleton code for this out as a separate package called fedora-workflow), and [...]]]></description>
			<content:encoded><![CDATA[<p>Just a quick update on my progress developing a shareable prototype. The basic integration work is functional; I&#8217;ve ripped out the previously-mentioned Camel workflow components in favor of ruote (which is so much easier to wrap my mind around; I&#8217;ve pushed the skeleton code for this out as a separate package called <a href="http://github.com/cbeer/fedora-workflow">fedora-workflow</a>), and I&#8217;ve started doing some very basic datastream display work.</p>
<p>After this work is complete, I think a first-round alpha will be ready to publish within the next couple of weeks.</p>
]]></content:encoded>
			<wfw:commentRss>http://authoritativeopinion.com/blog/2010/05/28/digital-asset-management-for-public-broadcasting-blacklight-interlud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Digital Asset Management for Public Broadcasting: Blacklight (Part 3 of ??)</title>
		<link>http://authoritativeopinion.com/blog/2010/05/10/digital-asset-management-for-public-broadcasting-blacklight-part-3-of/</link>
		<comments>http://authoritativeopinion.com/blog/2010/05/10/digital-asset-management-for-public-broadcasting-blacklight-part-3-of/#comments</comments>
		<pubDate>Mon, 10 May 2010 22:03:28 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://authoritativeopinion.com/blog/2010/05/10/digital-asset-management-for-public-broadcasting-blacklight-part-3-of/">chris</span></dc:creator>
				<category><![CDATA[Repository]]></category>
		<category><![CDATA[TODO]]></category>
		<category><![CDATA[blacklight]]></category>
		<category><![CDATA[digital asset management]]></category>
		<category><![CDATA[fedora]]></category>

		<guid isPermaLink="false">http://authoritativeopinion.com/blog/?p=337</guid>
		<description><![CDATA[In the previous parts, I wrote about two &#8220;back-office&#8221; open source applications (and tangentially discussed a few others) that are well-established in their communities and can support a wide variety of repository services. While it may be philosophically important that these are open source applications, I would argue that the next parts, in which I [...]]]></description>
			<content:encoded><![CDATA[<p>In the previous parts, I wrote about two &#8220;back-office&#8221; open source applications (and tangentially discussed a few others) that are well-established in their communities and can support a wide variety of repository services. While it may be philosophically important that these are open source applications, I would argue that openness matters more in the next parts, where I want to talk about services and applications built on top of the repository infrastructure. These benefit tremendously when anyone with a fairly broad skill-set can create and customize interfaces for specific use cases to the full extent necessary.</p>
<p><a href="http://projectblacklight.org">Blacklight</a> grew out of a next-generation library catalog interface, and while it still has very firm roots in the library world, it is also being used for archives, digital collections, and institutional repository interfaces. It is also an open source application, based on the Ruby on Rails framework.</p>
<p>Out of the box, it is a fairly generic interface to a Solr index (with a little sprinkling of optional MARC data) and some relatively benign application features (users, bookmarks, saved searches). Connecting it to our existing Solr index is fairly trivial, and just requires a few small configuration changes:</p>
<pre name="code" class="ruby">
config[:index_fields] = {
  :field_names =&gt; [
    "dc.description",
    "dc.creator",
    "dc.publisher",
    "dc.subject",
    "dc.date",
    "dc.format"
  ],
  :labels =&gt; {
    "dc.description" =&gt; "Description:",
    "dc.creator"     =&gt; "Creator:",
    "dc.publisher"   =&gt; "Publisher:",
    "dc.subject"     =&gt; "Subject:",
    "dc.date"        =&gt; "Date:",
    "dc.format"      =&gt; "Format:"
  }
}
</pre>
<p>This gives you a very basic discovery interface into your collection.</p>
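<p>To make the role of that hash concrete, here is a minimal sketch in plain Ruby of how a list of field names and labels could drive a results display. It is not Blacklight&#8217;s actual rendering code; <code>INDEX_FIELDS</code>, <code>display_rows</code>, and the sample document are names and data invented for illustration.</p>

```ruby
# Hypothetical stand-in for the configuration hash shown above.
INDEX_FIELDS = {
  field_names: ["dc.creator", "dc.date", "dc.format"],
  labels: {
    "dc.creator" => "Creator:",
    "dc.date"    => "Date:",
    "dc.format"  => "Format:"
  }
}

# Returns [label, value] pairs for the configured fields that are present
# in a Solr document (a plain Hash here), in the configured order.
def display_rows(solr_doc, index_fields = INDEX_FIELDS)
  rows = index_fields[:field_names].map do |name|
    values = Array(solr_doc[name])
    next if values.empty?
    [index_fields[:labels].fetch(name, name), values.join("; ")]
  end
  rows.compact
end

# Made-up sample document; "dc.format" is deliberately missing.
doc = { "id" => "demo:1", "dc.creator" => ["WGBH"], "dc.date" => ["2010-05-10"] }
display_rows(doc).each { |label, value| puts "#{label} #{value}" }
```

<p>Fields that are configured but absent from a record are simply skipped, which matters when the metadata is as sparse as it tends to be in legacy systems.</p>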
<p>Extending Blacklight to work with Fedora is also easy: in fewer than 50 lines of code, I had full access to the Fedora web services APIs and SPARQL interface. Adding management interfaces was also simple using normal Ruby on Rails techniques; with fewer than 500 lines of code, I had a passable repository manager interface and could import assets and metadata.</p>
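<p>As a rough idea of what that glue can look like, the sketch below only builds request URLs and makes no network calls. It is not the code from my application. The base URL, the helper names, and the <code>risearch</code> parameter values are assumptions modeled on the Fedora 3.x REST and resource index interfaces, so check them against your own Fedora version.</p>

```ruby
require "uri"

# Assumed base URL of a local Fedora 3.x install; adjust for your setup.
FEDORA_BASE = "http://localhost:8080/fedora"

# URL for the content of one datastream of one object.
# PIDs such as "demo:1" are used as-is in the path.
def datastream_url(pid, dsid, base = FEDORA_BASE)
  "#{base}/objects/#{pid}/datastreams/#{dsid}/content"
end

# URL for a SPARQL query against the resource index search endpoint.
# The type/lang/format values are assumptions; verify them locally.
def risearch_url(sparql, base = FEDORA_BASE)
  params = { type: "tuples", lang: "sparql", format: "CSV", query: sparql }
  "#{base}/risearch?#{URI.encode_www_form(params)}"
end

puts datastream_url("demo:1", "DC")
puts risearch_url("SELECT ?s WHERE { ?s ?p ?o }")
```

<p>From here, fetching a datastream is a single HTTP GET with whatever client library you prefer.</p>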
<p>Adding a security layer on top of the repository content is also easy, thanks to the work the UPEI team put into the <a href="http://www.fedora-commons.org/confluence/display/ISLANDORA/Islandora+Guide#IslandoraGuide-DrupalServletFilter">DrupalServletFilter</a>, which allows Fedora to authenticate users against any SQL database. Because of this, we can use the XACML policy language built into Fedora to do record-level security (I confess I don&#8217;t entirely understand XACML, but it is an enormously powerful and expressive language if you like XML verbiage). For storing re-use rights, I am very intrigued by <a href="http://odrl.net">the Open Digital Rights Language</a>, which can integrate with Fedora and Blacklight to express non-object-security rights (re-use, segmentation, etc.) using my proof-of-concept <a href="http://github.com/cbeer/ruby-odrl">ruby-odrl</a>.</p>
<p>With these fundamentals in place (ingest services, security policies, and resource discovery), one can build more advanced services on top of the repository, like collections, batch and on-demand conversion/transcode services, and export/transfer services (one-click &#8220;export to PBS COVE&#8221;?). Because these can be built as Rails plug-ins, they are readily shareable outside of this single application and provide templates for others to continue to develop and extend similar services to evolving platforms.</p>
<p>Because setting up a Blacklight application is so painless, it would be easy for public broadcasting institutions to create custom-made (yet shareable) modules and views for specific purposes (news, productions, archiving, etc) that all share the same back-end infrastructure yet offer users an easy way to interact with their data in a way that makes sense for their work. As I mentioned in <a href="http://authoritativeopinion.com/blog/2010/05/04/digital-asset-management-for-public-broadcasting-fedora-commons-repository-part-1-of/">my Fedora article</a>, you aren&#8217;t limited to data you control and have locally, but can bring in data from external sources (say, pulling in metadata from the NPR API or an RSS feed from a stock footage house) and present it both coherently and cohesively.</p>
<p>I&#8217;m looking for a good source of freely available test data, and I would rather not invest too much time building a corpus of archival assets if something suitable already exists. The biggest challenge is finding comprehensive metadata. The closest I&#8217;ve come are some podcast feeds from sources like Democracy Now!, but those don&#8217;t capture the breadth of materials I&#8217;d like to demonstrate.</p>
<p>Finally, a couple of requisite screen-shots now that there is something visual to work with, using the default Blacklight theme with some quick interface hacks.</p>

<a href='http://authoritativeopinion.com/blog/2010/05/10/digital-asset-management-for-public-broadcasting-blacklight-part-3-of/screen-shot-2010-05-10-at-9-12-21-am/' title='Screen shot 2010-05-10 at 9.12.21 AM'><img width="150" height="150" src="http://authoritativeopinion.com/blog/wp-content/uploads/2010/05/Screen-shot-2010-05-10-at-9.12.21-AM-150x150.png" class="attachment-thumbnail" alt="Screen shot 2010-05-10 at 9.12.21 AM" title="Screen shot 2010-05-10 at 9.12.21 AM" /></a>
<a href='http://authoritativeopinion.com/blog/2010/05/10/digital-asset-management-for-public-broadcasting-blacklight-part-3-of/screen-shot-2010-05-10-at-9-05-14-am/' title='Screen shot 2010-05-10 at 9.05.14 AM'><img width="150" height="150" src="http://authoritativeopinion.com/blog/wp-content/uploads/2010/05/Screen-shot-2010-05-10-at-9.05.14-AM-150x150.png" class="attachment-thumbnail" alt="Screen shot 2010-05-10 at 9.05.14 AM" title="Screen shot 2010-05-10 at 9.05.14 AM" /></a>

]]></content:encoded>
			<wfw:commentRss>http://authoritativeopinion.com/blog/2010/05/10/digital-asset-management-for-public-broadcasting-blacklight-part-3-of/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Digital Asset Management for Public Broadcasting: Solr (Part 2 of ??)</title>
		<link>http://authoritativeopinion.com/blog/2010/05/08/digital-asset-management-for-public-broadcasting-solr-part-2-of/</link>
		<comments>http://authoritativeopinion.com/blog/2010/05/08/digital-asset-management-for-public-broadcasting-solr-part-2-of/#comments</comments>
		<pubDate>Sat, 08 May 2010 14:29:34 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://authoritativeopinion.com/blog/2010/05/08/digital-asset-management-for-public-broadcasting-solr-part-2-of/">chris</span></dc:creator>
				<category><![CDATA[Repository]]></category>
		<category><![CDATA[TODO]]></category>
		<category><![CDATA[digital asset management]]></category>
		<category><![CDATA[solr]]></category>

		<guid isPermaLink="false">http://authoritativeopinion.com/blog/?p=335</guid>
		<description><![CDATA[The Lucene-based Apache Solr is an incredible platform for building decent search experiences, especially compared to the &#8220;more traditional&#8221; database-driven approach, where so many SQL JOINs are involved that it becomes difficult to efficiently add search features like stemming, ASCII-folding, term highlighting, facets, and synonyms. I would argue these are essential parts of the discovery experience [...]]]></description>
			<content:encoded><![CDATA[<p>The Lucene-based <a href="http://lucene.apache.org/solr">Apache Solr</a> is an incredible platform for building decent search experiences, especially compared to the &#8220;more traditional&#8221; database-driven approach, where so many SQL JOINs are involved that it becomes difficult to efficiently add search features like stemming, ASCII-folding, term highlighting, facets, and synonyms. I would argue these are essential parts of the discovery experience, and you essentially get them for free with Solr. Another benefit Solr provides is a foundation for many light-weight interfaces on top of a single index (or across multiple indexes, because Solr enforces some decent scalability principles that make expanding to task-based indexes easier).</p>
<p>For a DAM project, each asset should appear in the search index with the basic layer of contributed metadata, relationships, metadata extracted from the assets, as well as the administrative metadata managed by Fedora. I would align the fields to the Dublin Core (and DCTerms) elements (which is probably all you can get users to contribute in any case). At this point, because legacy systems lack authority control, linked data, and the like, existing metadata is sparse, inaccurate, or limited. That sets the entry-level bar pretty low, so ease-of-use and metadata collection are the priorities. Eliding a lot of detail, here&#8217;s the skeleton schema:</p>
<pre name="code" class="xml">
  &lt;field name="id" type="string" indexed="true" stored="true" required="true" /&gt;
  &lt;field name="title" type="string" indexed="true" stored="true" multiValued="true"/&gt;
  &lt;field name="description" type="string" indexed="true" stored="true"/&gt;

  &lt;dynamicField name="dc.*" type="string" indexed="true" stored="true" multiValued="true"/&gt;
  &lt;dynamicField name="dcterms.*" type="string" indexed="true" stored="true" multiValued="true"/&gt;
  &lt;dynamicField name="rdf.*" type="string" indexed="true" stored="true" multiValued="true"/&gt;
  &lt;field name="text" type="text" indexed="true" stored="false" multiValued="true"/&gt;
  &lt;field name="payloads" type="payloads" indexed="true" stored="true"/&gt;
  &lt;field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/&gt;

  &lt;copyField source="title" dest="title_t" /&gt;
  &lt;copyField source="subject" dest="dc.subject" /&gt;
  &lt;copyField source="description" dest="description_t" /&gt;
  &lt;copyField source="comments" dest="text" /&gt;
  &lt;copyField source="dc.creator" dest="author" /&gt;
  &lt;copyField source="dc.*" dest="text" /&gt;
  &lt;copyField source="text" dest="text_rev" /&gt;
  &lt;copyField source="payloads" dest="text" /&gt;

  &lt;copyField source="dc.title" dest="dc.title_t" /&gt;
  &lt;copyField source="dc.description" dest="dc.description_t" /&gt;
  &lt;copyField source="dc.coverage" dest="dc.coverage_t" /&gt;
  &lt;copyField source="dc.contributor" dest="dc.contributor_t" /&gt;
  &lt;copyField source="dc.subject" dest="dc.subject_t" /&gt;
  &lt;copyField source="dc.contributor" dest="names_t" /&gt;
  &lt;copyField source="dc.coverage" dest="names_t" /&gt;
</pre>
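<p>One detail of <code>copyField</code> is easy to trip over: Solr applies the rules to the fields of the incoming document only, and one copy does not feed into another. The Ruby sketch below simulates that behavior for a handful of the rules above. It is an illustration, not how Solr is implemented, and <code>COPY_FIELDS</code>, <code>source_match?</code>, and <code>apply_copy_fields</code> are made-up names.</p>

```ruby
# A small subset of the copyField rules from the schema above.
COPY_FIELDS = [
  ["dc.title",   "dc.title_t"],
  ["dc.creator", "author"],
  ["dc.*",       "text"],
  ["text",       "text_rev"]
]

# True when a field name matches a copyField source, which may end in
# a "*" wildcard (as in "dc.*").
def source_match?(pattern, name)
  if pattern.end_with?("*")
    name.start_with?(pattern.chomp("*"))
  else
    pattern == name
  end
end

# Applies every rule to the fields of the incoming document only, so a
# value copied into "text" is not copied again into "text_rev".
# All values are normalized to arrays for simplicity.
def apply_copy_fields(doc, rules = COPY_FIELDS)
  result = doc.transform_values { |v| Array(v).dup }
  doc.each do |name, value|
    rules.each do |source, dest|
      next unless source_match?(source, name)
      (result[dest] ||= []).concat(Array(value))
    end
  end
  result
end

# Made-up sample document.
doc = { "id" => "demo:1", "dc.title" => "Evening News", "dc.creator" => "WGBH" }
indexed = apply_copy_fields(doc)
p indexed["text"]
p indexed.key?("text_rev")
```

<p>In this simulation <code>text</code> collects both Dublin Core values, while <code>text_rev</code> stays empty because the incoming document has no <code>text</code> field of its own. If you rely on a reversed-text field, it is worth confirming that it is populated the way you expect.</p>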
<p>The new <a href="https://issues.apache.org/jira/browse/SOLR-1553">edismax query parser</a> provides such a good balance of flexibility, advanced query features, and ease-of-use that it seems like an obvious choice here.</p>
<p>The only penalty you pay by using Solr is having to keep the Solr index synchronized with your data sources. For synchronizing data from Fedora, there is now a proliferation of options, ranging from the task-specific, with Java plugins like <a href="http://www.fedora-commons.org/confluence/display/FCSVCS/Generic+Search+Service+2.2">GSearch</a> and <a href="http://github.com/mediashelf/shelver">Shelver</a>, to the more generic (ESBs and all that) like <a href="http://camel.apache.org/">Apache Camel</a> or the Ruote-based <a href="http://github.com/cbeer/fedora-workflow">Fedora Workflow</a> component. Because DAM likely involves many different workflows, I lean towards the more generic solutions. Lately, I&#8217;ve given Camel a try, and after a couple of days of Java-dependency-induced head pounding, I have something that works.</p>
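<p>For a sense of what an edismax request involves, this sketch builds the query string for a Solr select call. <code>defType</code>, <code>q</code>, <code>qf</code>, and <code>rows</code> are standard Solr parameters; the choice of fields and boosts in <code>qf</code> is only an assumption based on the skeleton schema, and <code>edismax_query</code> is a made-up helper.</p>

```ruby
require "uri"

# Builds the query string for a Solr select request using edismax.
# The field names and boosts in qf are illustrative assumptions.
def edismax_query(user_query, rows: 10)
  params = {
    "defType" => "edismax",
    "q"       => user_query,
    "qf"      => "dc.title_t^5 dc.subject_t^2 text",
    "rows"    => rows
  }
  URI.encode_www_form(params)
end

puts edismax_query("civil rights")
```

<p>The user&#8217;s input goes into <code>q</code> unchanged, and the weighting across fields lives in configuration rather than in query-building code.</p>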
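<p>Whatever tool carries the update message, the core of the synchronization step is a small mapping from repository metadata to a Solr document. Here is a minimal sketch, assuming the DC datastream has already been parsed into a Hash of element names and values; <code>solr_doc_for</code> and the sample record are hypothetical.</p>

```ruby
# Maps parsed Dublin Core elements onto the dc.* dynamic fields of the
# skeleton schema. Every value is wrapped in an array because those
# fields are multiValued.
def solr_doc_for(pid, dc_elements)
  doc = { "id" => pid }
  dc_elements.each do |element, values|
    doc["dc.#{element}"] = Array(values)
  end
  doc
end

# Made-up sample record.
record = { "title" => "Evening News", "subject" => ["news", "boston"] }
p solr_doc_for("demo:1", record)
```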
<p>&#8212;</p>
<p>On Twitter, <a href="http://twitter.com/johntynan/status/13400294844">John Tynan requested</a> a virtual machine image to encourage others to begin playing with this software, so I&#8217;ve actually begun building some of these pieces. Currently, I have Fedora/Camel/Solr/Blacklight installed and functional, but before I try to package it up, I feel like I should add an easy-to-use ingest system to get data in.</p>
]]></content:encoded>
			<wfw:commentRss>http://authoritativeopinion.com/blog/2010/05/08/digital-asset-management-for-public-broadcasting-solr-part-2-of/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Digital Asset Management for Public Broadcasting: Fedora Commons Repository (Part 1 of ??)</title>
		<link>http://authoritativeopinion.com/blog/2010/05/04/digital-asset-management-for-public-broadcasting-fedora-commons-repository-part-1-of/</link>
		<comments>http://authoritativeopinion.com/blog/2010/05/04/digital-asset-management-for-public-broadcasting-fedora-commons-repository-part-1-of/#comments</comments>
		<pubDate>Tue, 04 May 2010 23:33:48 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://authoritativeopinion.com/blog/2010/05/04/digital-asset-management-for-public-broadcasting-fedora-commons-repository-part-1-of/">chris</span></dc:creator>
				<category><![CDATA[Repository]]></category>
		<category><![CDATA[TODO]]></category>
		<category><![CDATA[digital asset management]]></category>
		<category><![CDATA[public broadcasting]]></category>

		<guid isPermaLink="false">http://authoritativeopinion.com/blog/?p=331</guid>
		<description><![CDATA[In my previous post, I provided a broad overview of the challenges and opportunities for developing an open source digital asset management system within the public broadcasting community, and described some fundamental technology that is already being developed and deployed within institutions. In this post, I want to look specifically at the role the Fedora [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://authoritativeopinion.com/blog/2010/05/03/digital-asset-management-for-public-broadcasting-part-0-of/">my previous post</a>, I provided a broad overview of the challenges and opportunities for developing an open source digital asset management system within the public broadcasting community, and described some fundamental technology that is already being developed and deployed within institutions. In this post, I want to look specifically at the role the <a href="http://fedora-commons.org">Fedora Commons repository architecture</a> can play in this environment. Additional reading is available from the Fedora Commons wiki, especially the <a href="http://www.fedora-commons.org/confluence/display/FCR30/Getting+Started+with+Fedora">Getting Started with Fedora</a> article, which articulates some of the strengths of their approach in the abstract.</p>
<p>The <a href="http://www.fedora-commons.org/confluence/display/FCR30/Fedora+Digital+Object+Model">Fedora Commons data model</a> is built on top of the <a href="http://www.cnri.reston.va.us/k-w.html">Kahn/Wilensky Architecture</a>, which describes a data structure for primary digital objects (irrespective of the data or formats contained within). Already, this is an improvement over some systems, which differentiate between content types, relegating some content formats to second-class citizenship. By providing a single, fundamental data type, one can build consistent user experiences on top of the discoverable components and interact with the digital objects to GET THINGS DONE.</p>
<p>Within digital objects are datastreams, which may include both data and metadata about the object, and are treated equally (more or less&#8230;). Datastreams can carry revision information, integrity checks, and other provenance information. By not distinguishing between &#8220;digital&#8221; assets (for which data, e.g. the media files, are available electronically) and other kinds of assets (physical tapes, abstract entities, etc.), an asset management system can encompass the full range of materials within an active media archive.</p>
<p>Digital objects can be assigned content model types, which stipulate the required (and optional) component datastreams, as well as define the services that operate on objects of that type. These content types are simply structured digital objects within the repository, allowing repository managers (and content creators, given a sufficient interface) to define the structure of their content rather than structuring their content to meet the needs of the digital asset management system.</p>
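<p>The idea of a content model stipulating required and optional datastreams can be sketched in a few lines of Ruby. This is a simplified illustration, not Fedora&#8217;s API: the real repository stores content models as digital objects, and <code>ContentModel</code>, <code>DigitalObject</code>, <code>missing_datastreams</code>, and the datastream IDs <code>MASTER</code> and <code>THUMBNAIL</code> are all hypothetical.</p>

```ruby
# Hypothetical, simplified stand-ins for Fedora concepts.
ContentModel  = Struct.new(:pid, :required, :optional)
DigitalObject = Struct.new(:pid, :datastream_ids, :model)

# Lists the datastream IDs the content model requires but the object lacks.
def missing_datastreams(object)
  object.model.required - object.datastream_ids
end

# A made-up video content model and an object that only has metadata,
# such as a physical tape that has not been digitized yet.
video_model = ContentModel.new("cmodel:video", ["DC", "MASTER"], ["THUMBNAIL"])
tape = DigitalObject.new("demo:tape-42", ["DC"], video_model)

p missing_datastreams(tape)
```

<p>Because the model is just data, a repository manager could change which datastreams are required without touching application code.</p>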
<p>Types of datastreams natively supported include Inline XML datastreams, Managed Content, Externally Referenced Content, and Redirects. The datastream types do not speak to the format of content stored within them (except for inline XML), which allows content creators to easily provide content to the repository without first worrying about transcoding materials or other barriers to accessioning content (which is certainly not to say that standardizing content types archived within the repository is problematic; just that it shouldn&#8217;t interfere with getting the materials in the first place). This variety of types allows content to be stored and managed in the most appropriate places, rather than arbitrarily requiring centralization or &#8220;physical&#8221; ownership of content. Within a distributed organization like public broadcasting, this could be a powerful concept that allows content creators to control and manage their content at various stages of distribution (and, while this could be accomplished within traditional database-driven systems, it would require custom application logic to do, which is likely not scalable across a wide variety of applications, frameworks, and languages).</p>
<p>While all datastreams are equal, there are four (or more?) that are more equal than others:</p>
<p>- AUDIT, which stores the history of the digital object as it is modified.</p>
<p>- DC, a Qualified Dublin Core datastream, that provides a minimal level of interoperability for the most generic of repository management interfaces. This is also the only fundamentally required datastream (without specifying required elements within it), and really is the bare minimum of information necessary to assert the existence of an object (if it doesn&#8217;t have a title, identifier, or description, what is it we&#8217;re talking about exactly?)</p>
<p>- RELS-EXT (and INT), an RDF-XML datastream in which one can assert relationships to other digital objects (which may exist within the repository, but may also exist (or not exist) elsewhere). These relationships can be from any vocabulary and reference any type of object, which is handy when you are dealing with complex relationships between media archive assets. This datastream is also generally indexed in an RDF triple-store to provide relationship querying.</p>
<p>- POLICY, which stores XACML security policies for the digital object, which can be used to restrict access to the datastreams, services, or the object based on whatever the security needs are. Within the digital asset management context, this could also be used to restrict access to only media files, while still providing the metadata (so one could assert and describe the existence of an object, without actually sharing it for whatever reason, which seems atypical for some commercial solutions).</p>
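<p>To illustrate the kind of relationship querying that RELS-EXT enables, here is a toy in-memory version in Ruby. It stands in for a real triple-store only loosely. The identifiers, the abbreviated <code>rel:</code> and <code>ex:</code> predicates, and the helper names are all made up for this sketch.</p>

```ruby
# Hypothetical triples of the kind RELS-EXT datastreams might assert.
# The last object points outside the repository, which is allowed.
TRIPLES = [
  ["demo:clip-1", "rel:isPartOf",   "demo:program-7"],
  ["demo:clip-2", "rel:isPartOf",   "demo:program-7"],
  ["demo:clip-1", "ex:wasFilmedAt", "http://example.org/places/boston"]
]

# Returns every object asserted for the given subject and predicate.
def objects_of(triples, subject, predicate)
  triples
    .select { |s, p, _o| [s, p] == [subject, predicate] }
    .map { |_s, _p, o| o }
end

# Returns every subject that points at the given object via the
# predicate: the reverse lookup a triple-store index makes cheap.
def subjects_of(triples, predicate, object)
  triples
    .select { |_s, p, o| [p, o] == [predicate, object] }
    .map { |s, _p, _o| s }
end

p objects_of(TRIPLES, "demo:clip-1", "rel:isPartOf")
p subjects_of(TRIPLES, "rel:isPartOf", "demo:program-7")
```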
<p>By default, these datastreams (and the digital object wrapper) are stored on the file system in relatively comprehensible ways, which is a bonus to implementors who can set up underlying hardware or other technology in traditional ways and just begin to use the software without too much fuss. There is ongoing development to build in support for additional and evolving standards around digital object storage, serialization, access, and other services which should only help with making the process as transparent as possible.</p>
<p>All of this technology and flexibility comes &#8220;free&#8221; with the repository architecture and doesn&#8217;t try to interfere with actually making use of the assets (except as restricted by security policies, of course), which allows different use cases to be expressed in the most logical and straightforward way (rather than trying to bend the use cases or system in an attempt to mimic some of the elements the user needs). As a starting point for developing a digital asset management solution for media, I believe it offers a good balance of flexibility and requirements that can ensure user needs are met without sacrificing durability.</p>
<p>So, how can Fedora be applied in a digital asset management context for public broadcasting? First and foremost, Fedora provides a trusted platform for managing and maintaining content for many different contexts (production, long-term archiving, etc) on top of a variety of hardware and standards. By managing metadata and data together, physical and digital assets can be revealed in a common interface (when appropriate) to meet the needs of researchers and scholars (for whom the knowledge of the existence of the asset is more essential than on-demand access). Finally, by offering a stable API to a variety of resources, use-case driven interfaces can be developed, shared, and maintained to meet different needs sensibly.</p>
]]></content:encoded>
			<wfw:commentRss>http://authoritativeopinion.com/blog/2010/05/04/digital-asset-management-for-public-broadcasting-fedora-commons-repository-part-1-of/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Digital Asset Management for Public Broadcasting (Part 0 of ?)</title>
		<link>http://authoritativeopinion.com/blog/2010/05/03/digital-asset-management-for-public-broadcasting-part-0-of/</link>
		<comments>http://authoritativeopinion.com/blog/2010/05/03/digital-asset-management-for-public-broadcasting-part-0-of/#comments</comments>
		<pubDate>Mon, 03 May 2010 23:52:33 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://authoritativeopinion.com/blog/2010/05/03/digital-asset-management-for-public-broadcasting-part-0-of/">chris</span></dc:creator>
				<category><![CDATA[Repository]]></category>
		<category><![CDATA[TODO]]></category>
		<category><![CDATA[digital asset management]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[public broadcasting]]></category>

		<guid isPermaLink="false">http://authoritativeopinion.com/blog/?p=328</guid>
		<description><![CDATA[Digital asset management is hard. Many people have solved many parts of the problem, but for a reasonably complex use-case, many of the existing solutions just aren&#8217;t there yet, especially within a vendor-driven world for a niche market within a niche market, which is concerned with all levels and life-cycles of an asset (from production, [...]]]></description>
			<content:encoded><![CDATA[<p>Digital asset management is hard. Many people have solved many parts of the problem, but for a reasonably complex use-case, many of the existing solutions just aren&#8217;t there yet. This is especially true in a vendor-driven world serving a niche market within a niche market: one concerned with all levels and life-cycles of an asset (from production, to reuse, to archiving and back again), and almost certainly not a profitable one given public broadcasting budgets. I believe this is an ideal area for the development of open source solutions based on some existing works of open source software.</p>
<p>The &#8220;easy&#8221; part in the DAM ecosystem, I would argue, is archiving the material and ensuring its long-term preservation (and accessibility!). I&#8217;ve done a couple of projects and prototypes now based on the <a href="http://fedora-commons.org">Fedora Commons</a> repository architecture, and it seems to be a promising platform for this kind of development. Objects and datastreams are stored on the file-system, which IT staff are traditionally prepared to manage (vs. some unique database structure almost certainly obfuscated in layers of (de-)normalization). Fedora will happily manage security policies, object relationships, data transformation services, and (shortly) more advanced file system interactions, while exposing a (relatively) consistent HTTP interface.</p>
<p>Discovery interfaces are probably the next easiest piece, having been examined and developed out of the information sciences communities. Using a combination like Solr and Blacklight (deployed successfully for WGBH&#8217;s <a href="http://openvault.wgbh.org">Open Vault</a> website), one can rapidly create interfaces to the underlying content that satisfy the many use cases. With Solr, you get a bunch of discovery mechanisms and options, including relevancy, term highlighting, faceting, etc.</p>
<p>From here, we start getting into the hard parts. Ingest and metadata editing are difficult to solve well in a content- and use-case-agnostic way, which is the approach most systems seem to take. While the need for a generic asset management view is important (and solved!), if the collection of services fails to meet the needs of the users, encouraging adoption (nicely) is problematic. By using infrastructure elements with open and well-documented APIs, developers can extend and customize the user experiences to match the underlying data and processes. This is an area in which the adoption and support of open source projects can encourage sustainable development of these interfaces.</p>
<p>It seems like, after clearing these obstacles, many systems fail to account for the use and re-use of these objects within the media communities. Few systems account for batch encoding video and audio for web distribution, one-click publishing systems to blogs, social networking sites, or video portals, integration into broadcasting chains, etc &#8212; for very good reasons, there simply isn&#8217;t the incentive when faced with large upfront development costs for unique development. Given an open source platform, however, that supports (and encourages) sharable development of solutions, maybe we could start finding answers to these persistent problems (without re-inventing the wheel!).</p>
<p>I believe most of the core infrastructure pieces are there:<br />
- Fedora, as I mentioned, which provides preservation and management services;<br />
- Solr, which provides a discovery framework (and associated metadata extraction utilities like Tika);<br />
- Blacklight, which provides discovery and access services;<br />
- An ESB or another workflow solution, like Camel or ruote;<br />
- Generic metadata editing options, like XForms, Django, etc;<br />
- Open standards that allow for publishing and reuse (Atom, MediaRSS, RDF, ???);<br />
- FFmpeg, which offers encoding and transcoding services.</p>
<p>It isn&#8217;t an extensive development problem: these are well-established communities in their fields. It&#8217;s a simple matter of getting initial momentum in tying the complex pieces together and creating interesting and useful services on top.</p>
<p>So, why aren&#8217;t we doing this? Money, time, lack of a collaborative/communicative culture, and apathy toward (and acceptance of) second-rate, buggy commercial solutions that fail to address all aspects of a media object&#8217;s life cycle as it goes from the rapid iterations in production to many different distribution channels and back to relative obscurity in an archival context (until a new production pulls it out again). Without full support, no step in the process can realize the potential of the content or have the incentive to put in the hard work to ingest and describe the asset.</p>
]]></content:encoded>
			<wfw:commentRss>http://authoritativeopinion.com/blog/2010/05/03/digital-asset-management-for-public-broadcasting-part-0-of/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>TODO: Map game</title>
		<link>http://authoritativeopinion.com/blog/2009/02/22/todo-map-game/</link>
		<comments>http://authoritativeopinion.com/blog/2009/02/22/todo-map-game/#comments</comments>
		<pubDate>Sun, 22 Feb 2009 20:35:41 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://authoritativeopinion.com/blog/2009/02/22/todo-map-game/">chris</span></dc:creator>
				<category><![CDATA[TODO]]></category>

		<guid isPermaLink="false">http://authoritativeopinion.com/blog/?p=21</guid>
		<description><![CDATA[I have created a quick prototype for a Google Maps mashup game. Users register and login using an OpenID login Users accumulate points Points are used to perform actions in the game world; examples of actions may include moving an object (which may be linked to the distance moved) or creating objects (which may have [...]]]></description>
			<content:encoded><![CDATA[<p>I have created a quick prototype for a <a href="http://ink.ratherinsane.com:81/">Google Maps mashup game</a>.</p>
<ol>
<li>Users register and log in using OpenID</li>
<li>Users accumulate points</li>
<li>Points are used to perform actions in the game world; examples of actions may include moving an object (which may be linked to the distance moved) or creating objects (which may have a fixed cost)</li>
</ol>
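<p>The point rules above can be sketched in a few lines. The specific costs, the per-kilometre rate, and the function names are all hypothetical; the prototype does not necessarily work this way.</p>

```python
import math

CREATE_COST = 50        # hypothetical fixed cost for creating an object
MOVE_COST_PER_KM = 2.0  # hypothetical points charged per kilometre moved


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) pairs, haversine formula."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def move_cost(origin, destination):
    """Moving an object costs points in proportion to the distance moved."""
    return math.ceil(distance_km(origin, destination) * MOVE_COST_PER_KM)


def spend(balance, cost):
    """Return the new balance, or raise if the player cannot afford the action."""
    if cost > balance:
        raise ValueError("not enough points")
    return balance - cost


# A player with 100 points creates one object.
balance = spend(100, CREATE_COST)
print(balance)
```

<p>Keeping the cost functions separate from the balance check should make it easier to add new actions (trading, combat) later without touching the accounting.</p>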
<p>So, how can we take this concept and turn it into an actual game? First, we need to incentivize these actions (that is, why does a player want to move?). A first step would be to add resources, which then creates a trading economy and another action that may cost points. To check boundless resource creation, we need to add the concept of resource scarcity. Second, we need to have a way of transforming the gathered resources into productive entities. Third, we need some sort of element of danger (combat, or what have you).</p>
<p>Finally, we need to move these abstract concepts into a game narrative. My first intuition is to make this a post-apocalyptic kind of game, which could explain the presence of structures but the lack of people, but it could easily be given some other interesting narrative. If anyone has thoughts on the narrative or design, I&#8217;d be interested in pursuing this further.</p>
]]></content:encoded>
			<wfw:commentRss>http://authoritativeopinion.com/blog/2009/02/22/todo-map-game/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TODO: A lisp-backed MySQL Storage Engine</title>
		<link>http://authoritativeopinion.com/blog/2009/01/19/todo-a-lisp-backed-mysql-storage-engine/</link>
		<comments>http://authoritativeopinion.com/blog/2009/01/19/todo-a-lisp-backed-mysql-storage-engine/#comments</comments>
		<pubDate>Mon, 19 Jan 2009 20:44:42 +0000</pubDate>
		<dc:creator><span property="dc:creator" resource="http://authoritativeopinion.com/blog/2009/01/19/todo-a-lisp-backed-mysql-storage-engine/">chris</span></dc:creator>
				<category><![CDATA[TODO]]></category>

		<guid isPermaLink="false">http://authoritativeopinion.com/blog/?p=33</guid>
		<description><![CDATA[Today, I ran across an interesting problem: I have a large set (10GB) of sql queries that I wanted to filter so I could import a small subset of the data. Obviously, some kind of stream filtering would be appropriate, but because the INSERT queries were in large batches, it seemed to be a bigger [...]]]></description>
			<content:encoded><![CDATA[<p>Today, I ran across an interesting problem: I have a large set (10GB) of SQL queries that I wanted to filter so I could import a small subset of the data. Obviously, some kind of stream filtering would be appropriate, but because the INSERT queries were in large batches, it seemed to be a bigger mental exercise than I wanted to tackle (mainly, writing a complex regular expression). After ruling that simplest approach out, I thought that the Lua-based <a href="http://forge.mysql.com/wiki/MySQL_Proxy">mysql-proxy</a> might be able to do the processing for me, providing a nice place in front of the database to do that filtering. Unfortunately, it doesn&#8217;t look like it will deal with these batch inserts in a friendly manner (which is probably a good thing, certainly for the proper use of mysql-proxy).</p>
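<p>For comparison, the stream-filtering route does not strictly need a regular expression. The sketch below walks a batched INSERT one character at a time, tracking quotes and parentheses, and keeps only the rows a predicate accepts. It assumes mysqldump-style input (an uppercase <code>VALUES</code> keyword, single-quoted strings with backslash escapes, one statement per line); the function names and the sample data are made up for illustration.</p>

```python
def split_rows(values_clause):
    """Yield each top-level "(...)" group from the text after VALUES."""
    depth = 0
    in_quote = False
    escaped = False
    start = None
    for i, ch in enumerate(values_clause):
        if in_quote:
            if escaped:
                escaped = False      # this character was escaped; skip it
            elif ch == "\\":
                escaped = True
            elif ch == "'":
                in_quote = False
        elif ch == "'":
            in_quote = True
        elif ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                yield values_clause[start:i + 1]


def filter_insert(statement, keep):
    """Rebuild a batched INSERT with only the rows for which keep(row) is true."""
    head, sep, tail = statement.partition(" VALUES ")
    if not sep:
        return statement             # not a batched INSERT; pass it through
    rows = [row for row in split_rows(tail) if keep(row)]
    if not rows:
        return None                  # every row was filtered out
    return head + " VALUES " + ",".join(rows) + ";"


# Hypothetical statement; the predicate sees the raw text of each row.
stmt = "INSERT INTO `posts` VALUES (1,'keep (me)'),(2,'drop'),(3,'it\\'s kept');"
filtered = filter_insert(stmt, lambda row: "drop" not in row)
print(filtered)
```

<p>Because each statement is handled on its own, a large dump can be processed line by line without loading it all into memory.</p>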
<p>Finally, I thought about the nuclear option, a vastly over-engineered and rather insane solution: a pluggable storage engine that can take in the data and simply filter the material it chooses to store. I&#8217;m positive this breaks all manner of standards, but it would be a delightfully easy solution and give me a chance to play with the new MySQL plugin API. Sure, some people would hack the example C module to do these tasks, nice and easy. Boring!</p>
<p>Some time back, I ran across Johannes Schlüter&#8217;s <a href="http://www.schlueters.de/blog/archives/96-MySQL-Storage-Engine-based-on-PHP.html">PHP-based MySQL storage engine</a>. I come from a PHP background, so this would obviously be the easiest engine to hack to my purposes. But it suffers from a serious lack of documentation, which, yes, could be overcome with a little work, and my first few attempts at compiling it ended in failure.</p>
<p>So, in an effort to add complexity, I realized that Lisp/Scheme might be a more efficient (read: unique) way to deal with this processing. To that end, I propose a Lisp-backed MySQL storage engine (mainly because I also want a chance to tinker with Lisp). I&#8217;ve gathered the documentation; I just need a chance to write some of the glue code.</p>
<ul>
<li><a href="http://bazaar.launchpad.net/~johannes-s/mysql-php-storage/trunk/files">Source code for the MySQL PHP storage engine</a></li>
<li><a href="http://forge.mysql.com/wiki/MySQL_Internals_Custom_Engine">MySQL documentation for custom storage engines</a>, actually a very interesting read</li>
<li><a href="http://docs.plt-scheme.org/inside/index.html">Inside: PLT Scheme C API</a>, which allows the interpreter to be extended by a dynamically-loaded library, or embedded within an arbitrary C/C++ program.</li>
</ul>
<p>So, all that is missing is another long weekend for hacking this together. February, I suppose.</p>
]]></content:encoded>
			<wfw:commentRss>http://authoritativeopinion.com/blog/2009/01/19/todo-a-lisp-backed-mysql-storage-engine/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
