Posts by author hkbruegm

Overanalyzing

Sometimes the analytical phase of investigation produces useless information about the system.

http://consense-project.com/raw-attachment/blog/information-visualization-time/Infovis_time.jpg

source:  Scagnetti et al. (2008): Reshaping Communication Design Tools

Highlighting a paragraph in LaTeX #2

In my last post regarding this topic I couldn't really find a solution to highlight the background of a paragraph in LaTeX when it contained block-level elements like citations.

What actually does work exactly as I wanted it to in the first place is using the  todonotes package:

\usepackage[bordercolor=white,backgroundcolor=gray!30,linecolor=black,colorinlistoftodos]{todonotes}
\newcommand{\rework}[1]{\todo[color=yellow,inline]{Rework: #1}}

Now marking up a paragraph in \rework{paragraph(s)} results in:

http://consense-project.com/raw-attachment/blog/latex_highlight_text_2/paragraph-highlight2.png

edit: this throws yet another error

! TeX capacity exceeded, sorry [input stack size=1500].

when the paragraph contains a footnote. Oh well - will have to do for now anyway.

Highlighting a paragraph in LaTeX

edit: see  Highlighting a paragraph in LaTeX #2 for a better solution to this.

When writing a text in Microsoft Word I like to highlight sections with a "neon marker" effect to have clear visual guidance on which parts still need reviewing or, for example, need additional citations. So basically this is how it should look in a LaTeX-generated PDF as well:


http://consense-project.com/raw-attachment/blog/latexhighlighttext/paragraph-highlight.png


Googling for latex highlight paragraph turns up blog posts like e.g. devdaily.com, which suggest the following approach to highlight a paragraph:

Add the following to your preamble:

\usepackage{color}
\newcommand{\hilight}[1]{\colorbox{yellow}{#1}}

To highlight text in the body of your document, use

\hilight{this is some highlighted text}

The problem with the previous method is that the color box does not wrap properly. Here's a much easier and more robust way of doing this:

\usepackage{soul}

To highlight text in the body of your document, use

\hl{this is some highlighted text}

Unfortunately this approach only works if the highlighted paragraph contains no nested block elements like a \citep command. If you, like me, use such elements in the text you want to highlight, you will get error messages, which makes the soul package quite unusable for the original purpose. Unfortunately I can't offer you a direct solution for this problem, but at least in my case highlighting the text itself (instead of marking up the background of the text) did the job.

For this, add the following to the preamble:

  \usepackage[usenames,dvipsnames]{color}
  \newcommand{\markup}[1]{{\color{Cerulean}{#1}}}

In the text: \markup{whatever text including \citep[p. 5f]{Brügmann.2010} lorem ipsum bla bla}

which results in:


http://consense-project.com/raw-attachment/blog/latexhighlighttext/paragraph_markup.png


If this causes an error of type option clash for package color you can comment out the \usepackage[usenames,dvipsnames]{color} line and simply include the dvipsnames switch in your global \documentclass options, e.g.:

\documentclass[a4paper,oneside,dvipsnames]{book}

Finally, for changing the color style in the \markup command see WikiBooks:LaTeX/Colors.

Package natbib error: Bibliography not compatible with author year citation

LaTeX error message upon build:

Package natbib Error: Bibliography not compatible with author-year citations.

In my case this was caused by a missing year entry in a bib-entry in  Mendeley. This caused something along these lines in bibliography.aux (important is the last line missing a year column):

\bibcite{Wurman.2000}{{27}{2000}{{Wurman}}{{}}}
\bibcite{Wurman2001}{{28}{2001}{{Wurman et~al.}}{{}}}
\bibcite{Xerox2008}{{29}{{Xerox Corporation}}{{}}}

To fix this:

  • add the missing year to whatever bib entry is missing that datum (either directly in the .bib file or indirectly in JabRef or e.g. Mendeley)
  • clear your bibliography.aux file
  • rebuild
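
With a large bibliography it can be tedious to spot the offending line by eye. The following is a minimal sketch in Python, assuming the .aux file uses the \bibcite layout shown above; the function name and the regular expressions are just for illustration and are not part of natbib:

```python
import re

# Key of any \bibcite line.
BIBCITE_KEY = re.compile(r"\\bibcite\{([^}]*)\}")
# Shape of a complete author-year entry: {{number}{year}{{short}}{{long}}}
BIBCITE_WITH_YEAR = re.compile(r"\\bibcite\{[^}]*\}\{\{\d+\}\{[^{}]+\}\{\{")


def keys_missing_year(aux_text):
    """Return the citation keys whose \\bibcite line has no year group."""
    missing = []
    for line in aux_text.splitlines():
        line = line.strip()
        key = BIBCITE_KEY.match(line)
        if key and not BIBCITE_WITH_YEAR.match(line):
            missing.append(key.group(1))
    return missing
```

Feeding it the contents of bibliography.aux (e.g. via open("bibliography.aux").read()) returns the keys to look up in the .bib file.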

Exporting Mediawiki Articles into Trac Wiki Pages

http://consense-project.com/raw-attachment/blog/mediawiki_export_to_trac-wiki/converter-icon.png

The problem doesn't seem to be exactly new, but apparently it isn't as common as one might think either: you want to migrate your MediaWiki wiki pages into an (existing) Trac installation.

The following is the indirect path to the solution with explanations on how things work. For the direct solution go to section Direct Solution.

Indirect Solution

The  TracHelpPage says regarding this issue:

You can use the  attachment:wiki:TracWiki:mediawiki2trac.py Download script as a starting point. WikiProcessor for the MediaWiki styles has been started as trac plugin on  http://trac-hacks.org/wiki/MediaWikiPluginMacro.

You are supposed to download the mentioned mediawiki2trac.py from the above link, copy it into a directory on your (web)server and execute it with e.g.:

    python /PATH/mediawiki2trac.py > mwexport-mediawiki2trac.sql

Unfortunately the script only does some minor importing without regard to versions, attachments (i.e. associated files), redirects etc.

A bit more interesting is the following Trac ticket:  http://trac.edgewall.org/ticket/5241 especially regarding the different handling of attachment/media-files in MediaWiki and TracWiki. MediaWiki handles attached files in a completely different way than Trac does:

In Trac, you have attached files that are associated with a given page. Whereas in MediaWiki you have files that are uploaded and it's up to the wiki-editors to create links to the uploaded documents.
So, the process for dealing with "attachments" depends on a couple of factors.
If you want your downloaded documents to be unique in Trac like they are in mediawiki, then that's a problem. AFAIK, uniqueness in mediawiki is based on the filename whereas in Trac it's a combination of filename and what Wiki page the attachment is associated with.
So if you have an attachment that is linked in multiple places on your MediaWiki pages, then to have the same effect you'd either have to set up an independent web for these attachments (so all things would link to the same file) or you need to give up on the notion that the file is unique and that you have attachments that represent copies of the file in question.

So - what's wrong here:

  • First we don't want redirect pages to create stub-pages in Trac with no redirect functionality.
  • Secondly existing media-references should be kept intact.

As I am not really proficient with Python I reimplemented the migration script with PHP:

Features

  • Import directly from the MediaWiki MySQL database into the Trac SQLite3 database file
  • Convert basic MediaWiki markup to Trac markup (so no separate MediaWiki parser for Trac is needed)
  • Optionally remove category entries from wiki pages
  • Preserve image/media links. This assumes you manually copy the old MediaWiki attachment files into a new flattened-out directory (without the hashed subdirectories)
  • Optionally migrate revision comments
  • Optionally keep redirect links intact. Even though Trac doesn't know redirects as MediaWiki does, the script will change links to redirect pages so that each link points directly to the real page
  • Optionally discard Stub pages
  • Optionally discard pages with fewer than X bytes
  • Remove custom specified substrings from pages (e.g. old MediaWiki templates)
  • Replace custom specified substrings with replacement-strings (e.g. custom copyright notices)
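
To give an idea of what the redirect and link handling involves, here is a minimal sketch. It is written in Python rather than PHP and is not the code from mediawiki2trac.php; the function names, the regular expressions and the choice to handle only [[Page]] / [[Page|label]] links and "*" list items are assumptions made for illustration. Page titles containing spaces are not handled.

```python
import re

REDIRECT = re.compile(r"^#REDIRECT\s*\[\[([^\]|#]+)", re.IGNORECASE)
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")
LIST_ITEM = re.compile(r"^(\*+)[ \t]*", re.MULTILINE)


def redirect_targets(pages):
    """Map each redirect page title to the title it redirects to."""
    targets = {}
    for title, text in pages.items():
        match = REDIRECT.match(text.lstrip())
        if match:
            targets[title] = match.group(1).strip()
    return targets


def resolve(title, targets):
    """Follow redirect chains to the real page; stop if a loop is detected."""
    seen = set()
    while title in targets and title not in seen:
        seen.add(title)
        title = targets[title]
    return title


def to_trac(text, targets):
    """Rewrite MediaWiki links and bullet lists into Trac wiki syntax."""
    def link(match):
        target = resolve(match.group(1).strip(), targets)
        label = (match.group(2) or "").strip()
        return "[wiki:%s %s]" % (target, label) if label else "[wiki:%s]" % target

    def item(match):
        # Trac expresses list nesting through indentation instead of "**".
        depth = len(match.group(1))
        return " " * (2 * depth - 1) + "* "

    return LIST_ITEM.sub(item, WIKILINK.sub(link, text))
```

Because links are rewritten to the final target, the redirect pages themselves no longer need to be imported as stubs.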

Limitations

There are some limitations though:

  • Complex markup (e.g. tables) will not be converted
  • As the script directly accesses the Trac Sqlite3 database file the corresponding Sqlite3 extension needs to be enabled in PHP
  • The script discards old revisions, as I personally didn't need them - if you take a look at the old Python version of this script it shouldn't be hard to implement this though

Direct Solution

  1. Back up your Trac database file - seriously - do it!
  2. Download and unzip the following script into a web-accessible directory on your webserver:  mediawiki2trac.zip
  3. Edit the settings section at the start of the script - comments should hopefully be self-explanatory.
  4. Call mediawiki2trac.php in a web browser.
    1. You need Sqlite3 support in PHP for the script to work. If you get an error message like Fatal error: Class 'SQLite3' not found in.... you are missing the extension. See for example  http://ubuntuforums.org/showthread.php?t=891767 on how to install this on Ubuntu.
    2. When calling the script it first starts in test-mode showing you what replacements etc would be done. Activate live processing by clicking on the given link.
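
For the backup in step 1, copying the database file while Trac is stopped is enough. If the site has to stay online, SQLite's online backup API produces a consistent copy instead. A minimal sketch, assuming Python 3.7 or newer (where sqlite3.Connection.backup is available); the function name and the paths you pass in are placeholders:

```python
import sqlite3


def backup_sqlite(src_path, dest_path):
    """Copy the SQLite database at src_path into a new file at dest_path."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # consistent snapshot, even if the source is in use
    finally:
        dest.close()
        src.close()
```

A call would look like backup_sqlite("/path/to/trac/db/trac.db", "/path/to/trac.db.bak"), with both paths adjusted to your installation.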

  • Posted: 2010-04-14 02:53 (Updated: 2010-04-16 03:09)
  • Author: hkbruegm
  • Categories: trac
  • Comments (0)

Truebert

Dilbert.com
  • Posted: 2010-04-12 18:27 (Updated: 2010-04-13 19:04)
  • Author: hkbruegm
  • Categories: (none)
  • Comments (0)

Going Sans-Serif in LaTeX

While I am quite a fan of LaTeX, when looking at LaTeX-created documents I often had the feeling that the general look was too "latexy". It is actually similar to the feeling MS Office 2003 gave with Arial and Office 2007 gives with the omnipresent use of Calibri.

So - if you aren't a die-hard serif-only evangelist I can recommend the Kepler fonts packaged in kpfonts:

\usepackage[largesmallcaps,nofligatures]{kpfonts} 
\renewcommand*\familydefault{\sfdefault} 

\usepackage{sectsty}
\allsectionsfont{\textothersc}

Result:

http://consense-project.com/raw-attachment/blog/latex-sans-serif-kpfonts/kpfont-example.png

Using Mendeley to manage BibTex References and Citations

http://consense-project.com/raw-attachment/blog/references/search_1267784181.gif

Mendeley Desktop organizes your research paper collection and citations. It automatically extracts references from documents, generates bibliographies, and is freely available on Windows, Mac OS X and Linux.

Mendeley Web lets you access your research paper library from anywhere, share documents in closed groups, and collaborate on research projects online. It connects you to like-minded academics and puts the latest research trend statistics at your fingertips.


Research papers collected using Mendeley

XNA: The specified Module could not be found

http://consense-project.com/raw-attachment/blog/xna_specified_module_could_not_be_found/xna.png

During development of the ConSense Cockpit solution I experienced (see 348) a problem when deploying a Windows Presentation Foundation (WPF) solution on a target client with no XNA (btw - an extremely lame acronym from Microsoft) installed, which resulted in:

The specified module could not be found. (Exception from HRESULT: 0x8007007E)

For me the trick was to include the libraries:

  1.  d3dx9_31.dll
  2.  msvcr80.dll
  3.  X3DAudio1_4.dll

in the solution project and have them copied in a post-build event to the deployment folder (/installDir).

Given the copyright and litigation trouble that would surely follow, I hope you understand that I just linked the respective Google queries instead of uploading the DLLs myself ;-(

Trac and Google

http://consense-project.com/raw-attachment/blog/trac_google/google.png

No idea why, but I couldn't get custom files in Trac's /htdocs/ folder to be accessible using e.g.

<Location "/robots.txt">
    SetHandler None
</Location>

This caused some headaches on how to make this Trac installation / Blog accessible to Google. In the end three plugins helped to

  1. make Google see my robots.txt:  RobotsTxtPlugin
  2. verify the site in Google Webmaster Central:  GoogleWebmasterVerifyPlugin
  3. enable Google Analytics:  TracGoogleAnalytics

The usual

    sudo easy_install svn-path-to-the-plugin

and restarting Apache was all it took to get those three working.

  • Posted: 2010-03-29 21:40 (Updated: 2010-03-30 02:19)
  • Author: hkbruegm
  • Categories: trac
  • Comments (0)

Assisting the Discovery and Reuse of Document-based Knowledge using semantic Metadata

Extracting Metadata from Office 2007 XML Documents

http://consense-project.com/raw-attachment/blog/extracting_metadata_office_2007_xml_documents/docx-icon.jpg

Apparently you can use DSOfile for Open XML documents, but at least for me there were problems in x64 environments - see Stackoverflow: Unable to read Office 2007 doc props using x86 dsofile.dll on x64 system:

In this 64-bit environment, DSOFile.dll can successfully read properties from Office 2003 documents (e.g. DOC), but in the case of Office 2007 documents (e.g. DOCX), only empty strings are returned for all properties, or else an error is generated.

So - use the  Open XML SDK 2.0 for Microsoft Office. After downloading and referencing the dll:

using System.IO.Packaging;               // PackageProperties
using DocumentFormat.OpenXml.Packaging;

// Open read-only; the using block closes the package again.
using (WordprocessingDocument document = WordprocessingDocument.Open(filePath, false))
{
    // Standard package properties (title, creator, dates, ...)
    PackageProperties standardProperties = document.PackageProperties;

    CoreFilePropertiesPart fileProperties = document.CoreFilePropertiesPart;

    // Extended (application) properties; the part may be missing
    ExtendedFilePropertiesPart extendedPropertiesContainer = document.ExtendedFilePropertiesPart;
    if (extendedPropertiesContainer != null)
    {
        var extendedProperties = extendedPropertiesContainer.Properties;
    }

    // Custom properties; the part may be missing
    var customPropertiesContainer = document.CustomFilePropertiesPart;
    if (customPropertiesContainer != null)
    {
        var customProperties = customPropertiesContainer.Properties;
    }
}
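
For a quick cross-check without the SDK it helps to remember that a .docx file is a ZIP package whose core properties are stored as XML. The following minimal Python sketch assumes the conventional part name docProps/core.xml (strictly speaking the part is located via the package relationships); the function name and the selection of fields are made up for illustration:

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_core_properties(docx_path):
    """Return a few core properties of an Open XML document (None if absent)."""
    with zipfile.ZipFile(docx_path) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    return {
        "title": root.findtext("dc:title", namespaces=NS),
        "creator": root.findtext("dc:creator", namespaces=NS),
        "last_modified_by": root.findtext("cp:lastModifiedBy", namespaces=NS),
    }
```

Since this only uses the standard library it does not depend on the bitness of any COM component.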

Upgrading TeamCity - Catalina_Home and Basedir Issues

On Ubuntu:

The BASEDIR environment variable is not defined correctly. This environment variable is needed to run this program

Don't fall into the same trap as me: remember to give execute permissions to the .sh and .jar files in TeamCity/bin after uploading the new TeamCity files from Windows.

http://consense-project.com/raw-attachment/blog/teamcity_basedir_catalina_home/teamcity_startup.png

By the way - to include the build status of your TeamCity installation in Trac, create a file TeamCityPlugin.py in your trac/plugins folder:

from trac.wiki.macros import WikiMacroBase
import urllib  # Python 2 (as used by Trac 0.11); on Python 3 use urllib.request


class TeamCityStatusMacro(WikiMacroBase):
    """Inserts the current build status into the wiki page."""

    def expand_macro(self, formatter, name, args):
        # Fetch TeamCity's status snippet and return it as the macro output.
        res = urllib.urlopen('http://path-to-teamcity:8111/externalStatus.html').read()
        return res

Reusing Subversion-managed Library Projects in Visual Studio

http://mikesansone.typepad.com/photos/uncategorized/stopwatch_1.jpg

Just posting this as it would have saved me some time myself back then:

In the ConSense Prototype I have some custom libraries (especially the proprietary RDF framework) which are used in multiple application solutions (WPF Cockpit, VSTO Plugins, Windows/WCF Service, plugins for those services etc.).

So - even if you use Subversion this still leaves the question open how to keep the libraries in sync, assuming they are included in multiple Visual Studio solutions and potentially simultaneous changes before a check-in can occur.

The solution for me - and I hope you allow some advertisement for a commercial software here - was the combination of  svn:externals and the  VisualSVN Visual Studio plugin.

The libraries are kept in a separate directory in subversion and referenced as an external in the "main" application's subversion properties:

Select the application directory in subversion (Tortoise -> Repo-browser) and in the properties section (right click on the directory and select properties) add a new key "svn:externals". The value looks something like

Externals/ConSense.Generic.Rdf http://subversion.consense-project.com/2_Generic/ConSense.Generic.Rdf

with the first part being the directory in your application solution that the external sources will be placed in upon checkout and the second being the svn path to the external. When you now check out the application using VisualSVN, the library source files are automatically checked out as well into the directory you specified as the first parameter (Externals/ in this example).

The only caveat is that the global checkout button of VisualSVN only checks out the main solution, not the included externals, so additional manual checkouts (right click on the external project --> subversion --> checkout) are needed.

Resolving RDF Relationship-targets at Runtime

From  Jeni Tennison: On Resolvability:

There are three ways of locating the documentation about a particular property or class:

1) looking through the general documentation the data publisher has provided
2) resolving the URI of the class or property
3) searching

Using URIs for classes and properties provides a mechanism for applications to get hold of this extra information about unknown vocabularies. They might try four tactics, in order of priority:

1) look at the data they already know; the information they need about the unknown properties and classes may be included in the files they’ve already accessed (including those containing data)
2) look in an application-specific (possibly cloud-hosted) cache of vocabularies that the application has already downloaded
3) resolve the URI of the class or property by performing an HTTP GET (and add it to the application-specific cache)
4) look in a general-purpose cache, such as the Internet Archive or an ontology repository such as Swoogle
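
The four tactics amount to a lookup chain with a fixed priority. A minimal sketch of that chain in Python, assuming the already loaded data and the application cache are plain dictionaries and the two remote lookups are passed in as functions that return None on failure; all names here are hypothetical and not taken from the quoted post:

```python
def describe_term(uri, local_data, app_cache, http_get, general_cache_get):
    """Try the four tactics in priority order.

    Returns a (source, description) tuple, or None if nothing was found.
    """
    if uri in local_data:                  # 1) data the application already knows
        return "local", local_data[uri]
    if uri in app_cache:                   # 2) application-specific cache
        return "app-cache", app_cache[uri]
    description = http_get(uri)            # 3) resolve the URI with an HTTP GET
    if description is not None:
        app_cache[uri] = description       #    and add it to the application cache
        return "http", description
    description = general_cache_get(uri)   # 4) general-purpose cache or repository
    if description is not None:
        return "general-cache", description
    return None
```

Passing the remote lookups in as functions keeps the chain testable without network access and leaves open which HTTP client or repository API is used.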

Generic API's:

SPARQL Endpoints:


edit: strange - a Sindice API call returns an XML result set alright, but the contained URI of that query itself (seems to be meant as a self-reference) leads nowhere. See for example:

Open Research in Information Systems

http://repairstemcell.files.wordpress.com/2009/03/unhappy-smiley.jpg

I am constantly frustrated that the EU or whoever is so kind as to sponsor research activities does not force the respective institutes to open up the underlying program code of whatever prototypes or tests were developed/conducted as part of their funding obligations.

Rigorous empirical/behavioral research is all well and good, but the more practical advances in our field require an enormous amount of coding hours. So while I am able to read quite interesting and clever papers on previous research activities, the most basic of wheels regarding prototype development have to be reinvented over and over. Even if scientific publications actually reveal the core algorithms used in a modern (rich) client prototype, the complexity of such applications makes rebuilding that piece of software (which is often necessary to actually "go further") far from trivial.

Add the fact that quite a bit of the baseline research in IS is performed in a small-group or even individual-closed-closet type of environment and also that those individuals have their own agenda of not spending 12 years on a ph.d. and you end up with an auto-selected choice of individual research scope which is accomplishable using existing open-source libraries/platforms (and therefore far smaller than desirable). This leads to the question of who is helping whom here - shouldn't the open source community benefit from publicly tax-funded research instead of vice versa?

I won't even get started about hype topics appearing in blogs 1 year before they do in Gartner Hype Cycles and 2 years before they do at research conferences, but please please - allow us dwarves to step on the shoulders of other dwarves to accomplish more gigantic tasks.

edit: this rant was kind of spawned by my frustration that Microsoft Research lets their activity logger PersonalVibe2 (which they kindly provided - in a compiled form - to the public) rot unusable on Win7/64bit systems even if supposedly only a 5-minute modification would be required to make it work again. Of course as a private company MS is free to do as they wish and is not the real target of the above paragraphs. Still: facepalm.

WCF inside Windows Service: HTTP could not register URL http://+:8000

Following the usual-suspect kind of MSDN tutorials to make a Windows Service able to communicate with a WPF application (that is, have a WCF service hosted inside your Windows Service) I ended up with this kind of app.config section:

<services>
      <service name="ConSense.Sensors.Service.ConSenseSensorsWcfService"
               behaviorConfiguration="ConSenseSensorsWcfServiceBehavior">
        <host>
          <baseAddresses>
            <add baseAddress="http://localhost:8000/ConSense/service"/>
          </baseAddresses>
        </host>
        <!-- this endpoint is exposed at the base address provided by host: http://localhost:8000/ConSense/service  -->
        <endpoint address=""
                  binding="wsHttpBinding"
                  contract="ConSense.Sensors.Service.IConSenseSensorsWcfService" />
        <!-- the mex endpoint is exposed at http://localhost:8000/ConSense/service/mex -->
        <endpoint address="mex"
                  binding="mexHttpBinding"
                  contract="IMetadataExchange" />
      </service>
    </services>

In a UAC environment (Vista/Win7) this leads to the error message shown in the title of this post: HTTP could not register URL http://+:8000 - meaning your LocalService account is not allowed to open such an endpoint.

Maybe due to SOA hype or whatever, every tutorial and help file around seems to assume you want to build a supercool SOAP Business Service exposed over the internet the moment you set eyes on WCF. As a result, the top 10 Google results for this exception suggest you should just run Visual Studio with admin privileges and be done with it. The better answer (at least for my scenario) was to skip the HTTP stuff and use a named pipe as a WCF endpoint instead:

<services>
      <service name="ConSense.Sensors.Service.ConSenseSensorsWcfService"
               behaviorConfiguration="ConSenseSensorsWcfServiceBehavior">
        <host>
          <baseAddresses>
            <add baseAddress="net.pipe://localhost/ConSenseSensorsService"/>
          </baseAddresses>
        </host>
        <!-- this endpoint is exposed at the base address provided by host: net.pipe://localhost/ConSenseSensorsService  -->
        <endpoint 
                  address="net.pipe://localhost/ConSenseSensorsService"
                  binding="netNamedPipeBinding"
                  bindingConfiguration="netNamedPipeBindingUnsecure"
                  contract="ConSense.Sensors.Service.IConSenseSensorsWcfService" />
        <!-- this mex endpoint is exposed at the base address provided by host: net.pipe://localhost/ConSenseSensorsService/mex -->
        <endpoint address="mex"
                  binding="mexNamedPipeBinding"
                  contract="IMetadataExchange" />
      </service>
    </services>

    <bindings>
      <netNamedPipeBinding>
        <binding name="netNamedPipeBindingUnsecure" >
          <security mode = "None">
          </security>
        </binding >
      </netNamedPipeBinding>
    </bindings>

    <!--For debugging purposes set the 
        includeExceptionDetailInFaults attribute to true-->
    <behaviors>
      <serviceBehaviors>
        <behavior name="ConSenseSensorsWcfServiceBehavior">
          <serviceMetadata />
          <serviceDebug includeExceptionDetailInFaults="True" />
        </behavior>
      </serviceBehaviors>
    </behaviors>

Debugging a Windows Service

This may be blatantly obvious to most developers, but just to save someone the time of figuring this out themselves: to attach the debugger to your process, put a sleep into the service's Main method so you have enough time to attach before whatever may be causing an exception runs:

System.Threading.Thread.Sleep(10000);

Custom CSS for Trac

http://misc.consense-project.com/Media/trac_colorful.png

I really love Trac - don't get me wrong - but red just isn't my color. So I made a couple of halfhearted moves in the past to style my Trac installation (the one you are looking at atm) a bit, but always had a déjà vu when stumbling across Trac: Customizing the Trac Interface as it - in my understanding - meant I had to learn the templating engine used by Trac (Genshi). Being a Python noob myself, investing a lot of time digging through sources wasn't exactly desirable either.

Luckily Trac-Hacks.org hosts a plugin called ThemeEnginePlugin which allows replacing the default Trac theme with custom theme packs. There are a couple of pre-prepared theme packs available at Trac-Hacks:Themes, but half of them look broken (at Trac 0.11). I was quite satisfied to find that the ThemeEnginePlugin also allows you to append custom CSS to every Trac page, letting you override whatever you like in the default CSS definitions. Just remember to update your trac.ini (and restart Apache afterwards) with

[theme]
enable_css = true

Thanks  Noah Kantrowitz aka coderanger!

Ah yes - just as it took me a while to figure this out back then: inserting arbitrary html in Trac wikipages can be done with

{{{
#!html
HTML HERE
}}}
  • Posted: 2010-03-16 19:43 (Updated: 2010-03-17 14:34)
  • Author: hkbruegm
  • Categories: trac
  • Comments (0)

Log4Net in a Windows Service

http://intelligentforms.net/cms/wp-content/themes/iF/imgs/page_laptoppillows_log_red.jpg

Requirement: I want a Windows Service

Seems easy, but wut - no logfile created!

The help at Apache FAQ: Why doesn't the logging in my service work? is fine, but maybe I just skipped over it - it took me a while to figure out that the problem of log4net not showing any results (meaning: writing out a logfile) was related to the Windows service not being able to find my application.config file (because the service was called from a different location than my output dir).

The Apache FAQ suggests using AppDomain.BaseDirectory, which unfortunately fails when I want to initialize the logger in the service's Main() method due to Main being static. Copying the app.config to, let's say, c: and using

log4net.Config.XmlConfigurator.Configure(new FileInfo("c:\\app.config"));

works, but isn't exactly what we want - at least it should use some temporary or application-specific dir in %appdata%


edit: time to bang head against something again - a failure in the configuration file caused the problem.

This line allowed logging to a client-specific directory:

<file type="log4net.Util.PatternString" value="%env{APPDATA}\\ConSense\\Logs\\ConSense.Sensors.Service.ConsoleStart.Log.txt" />

Btw - a very nice introduction to Log4Net:

ConSense Project

Management of unstructured Information using semantic Metadata

The amount of unstructured electronic documents in enterprise environments is growing rapidly. The ConSense project aims to assist the enterprise-wide lifecycle management of electronic documents by utilizing the context of document access by knowledge workers. From this context data one can deduce semantic relations among documents and business-domain-specific entities, which can be combined into a semantic network. Querying the resulting network allows for the discovery and reuse of unstructured documents.

http://misc.consense-project.com/Logos/ConSense_research.png

Actual Business Domain
Knowledge workers use, create and collaborate with documents in business processes.
Context Sensors
Client-side software plugins track the context of business-relevant document usage and submit it in condensed form to the central semantic virtualization-store.
Semantic Virtualization
The context information is analyzed using heuristic business-rules resulting in semantic relationships among documents, persons, products, processes and services within the domain.
Task-specific Information
The semantic relationships are used to proactively supply knowledge workers with information and documents related to their actual task-context.


ConSense is a project of the  Department for Information Systems II (Service and Process Management -  Prof. Bodendorf)
 University of Erlangen-Nürnberg
Project contact:  Hinnerk Bruegmann

Student Participation - Master Thesis

  • Sauer, Dieter: Automatic Detection of HCI Action Clusters and corresponding Workflow Patterns (Automatische Erkennung von Aktionsclustern und deren Zuordnung zu Workflows)
  • Sauer, Dieter: Using Metaanalysis to extract contextual Information from the local File System
  • Heckl, Christian: Interaktive Informationsvisualisierung im "Document Lifecycle Management"
  • Steininger, Oliver: Erkennung von E-Mail Konversationsfäden (Recognizing Conversation Threads in Email Communication)

Student Participation - Term Papers

  • Heckl, Christian; Sauer, Dieter: Unterstützung der Analyse von Informationszusammenhängen durch selektive Visualisierung (Supporting the Analysis of interconnected Information by selective Visualization)
