Image
Caption

A close-up view of a brown horse. The horse has a thick white blaze running along its face, and next to its face is a rusted-out fence that has thick and thin poles running vertically and horizontally. The top of the head is filled with blonde hair that runs along the horse's neck. Along the backside of the horse, the stomach and rib cage stand out. The feet of the horse are white, and they are standing on a dirt pathway. Behind the horse, there is a grass field that has shade and sunlight. On the other side of the grass field, there is a tree line that is filled with dark green leaves.

Scene Graph
QA + verification programs
Q: How is the horse related to its surroundings spatially?
A: The horse is on a dirt pathway, in front of a grass field and next to a rusted-out fence.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if "horse" in entity:
                outgoing_relations = scene_graph.get_outgoing_relations(entity)
                if 'dirt pathway' in outgoing_relations and outgoing_relations['dirt pathway']['spatial'] == 'on':
                    if 'grass field' in outgoing_relations and outgoing_relations['grass field']['spatial'] == 'in front of':
                        face_relations = scene_graph.get_outgoing_relations("horse's face")
                        if 'fence' in face_relations and face_relations['fence']['spatial'] == 'next to':
                            return "The horse is on a dirt pathway, in front of a grass field and next to a rusted-out fence."
        return None

gpt-4o: The horse in the image is standing behind a metal gate or fence. Its head is positioned close to the bars of the gate, and it appears to be looking through the gaps. The horse is in an outdoor setting, with a dirt or gravel ground beneath it. In the background, there are green fields and some trees, indicating a rural or farm environment. The horse is spatially confined within the fenced area, and its proximity to the gate suggests it might be interested in something on the other side.
hscore: 67.08%
• <spatial (horse, fence, next to)> == <spatial (horse's head, gate, close to)> (80.0%)
• <material (pathway, dirt)> == <material (ground, dirt or gravel)> (70.0%)
• <spatial (horse, pathway, on)> == <spatial (horse's head, gate, close to)> (70.0%)
tscore: 79.06%
• <whole (horse)> == <whole (horse)> (100.0%)
• <state (horse, standing)> == <state (horse, stand)> (90.0%)
• <part (horse's head)> == <part (horse's face)> (90.0%)
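
The verification programs in these examples call a small scene-graph interface (get_entities, get_attributes, get_outgoing_relations, get_incoming_relations). The interface itself is not shown in this section, so the following is only a minimal sketch of a stand-in, assuming relations are stored as nested dictionaries keyed by target entity and then by relation type. The class name MockSceneGraph, its add_entity and add_relation helpers, and the hand-built graph are hypothetical and are not the authors' implementation. The sketch is just enough to run the first program above on a graph written by hand from the caption.

```python
from collections import defaultdict


class MockSceneGraph:
    """Hypothetical stand-in for the scene-graph API used by the programs."""

    def __init__(self):
        self._attributes = {}                 # entity -> {attribute: value}
        self._outgoing = defaultdict(dict)    # source -> {target: {type: value}}
        self._incoming = defaultdict(dict)    # target -> {source: {type: value}}

    def add_entity(self, name, **attributes):
        self._attributes.setdefault(name, {}).update(attributes)

    def add_relation(self, source, target, rel_type, value):
        # Register both endpoints so they show up in get_entities().
        self.add_entity(source)
        self.add_entity(target)
        self._outgoing[source].setdefault(target, {})[rel_type] = value
        self._incoming[target].setdefault(source, {})[rel_type] = value

    def get_entities(self):
        return list(self._attributes)

    def get_attributes(self, entity):
        return dict(self._attributes.get(entity, {}))

    def get_outgoing_relations(self, entity):
        return dict(self._outgoing.get(entity, {}))

    def get_incoming_relations(self, entity):
        return dict(self._incoming.get(entity, {}))


def execute_command(scene_graph):
    # Same logic as the first verification program above.
    for entity in scene_graph.get_entities():
        if "horse" in entity:
            outgoing = scene_graph.get_outgoing_relations(entity)
            if 'dirt pathway' in outgoing and outgoing['dirt pathway']['spatial'] == 'on':
                if 'grass field' in outgoing and outgoing['grass field']['spatial'] == 'in front of':
                    face = scene_graph.get_outgoing_relations("horse's face")
                    if 'fence' in face and face['fence']['spatial'] == 'next to':
                        return ("The horse is on a dirt pathway, in front of "
                                "a grass field and next to a rusted-out fence.")
    return None


# Hand-written graph derived from the caption (an assumption, not the real graph).
graph = MockSceneGraph()
graph.add_relation("horse", "dirt pathway", "spatial", "on")
graph.add_relation("horse", "grass field", "spatial", "in front of")
graph.add_relation("horse's face", "fence", "spatial", "next to")
graph.add_entity("fence", state="rusted-out")

result = execute_command(graph)
print(result)
```

With this toy graph the program returns the reference answer; on a graph that lacks any of the three relations it falls through to None, which is how a program of this shape signals that the answer is not supported by the graph.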

Q: What does the top of the horse's head consist of visually?
A: The top of the horse's head is filled with blonde hair that runs along the horse's neck.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if "top of the head" in entity:
                hair_relations = scene_graph.get_incoming_relations("top of the head")
                if 'hair' in hair_relations and hair_relations['hair']['spatial'] == 'fill':
                    hair_attributes = scene_graph.get_attributes('hair')
                    if hair_attributes.get('color') == 'blonde':
                        horse_neck_relations = scene_graph.get_outgoing_relations('hair')
                        if "horse's neck" in horse_neck_relations and horse_neck_relations["horse's neck"]['spatial'] == 'run along':
                            return "The top of the horse's head is filled with blonde hair that runs along the horse's neck."
        return None

gpt-4o: The top of the horse's head in the image consists of a mane with long, light-colored hair that falls over its forehead. The mane appears to be a light blonde or cream color, contrasting with the darker brown color of the horse's coat. The horse's ears are also visible, standing upright and pointing slightly outward. The mane covers part of the horse's eyes, giving it a somewhat shaggy appearance.
hscore: 92.95%
• <part (horse's head)> == <part (horse's head)> (100.0%)
• <spatial (horse's hair, horse's neck, along)> == <spatial (horse's mane, horse's eyes, covers)> (90.0%)
• <color (horse's hair, blonde)> == <color (horse's mane, light blonde or cream)> (90.0%)
tscore: 84.46%
• <whole (horse)> == <whole (horse)> (100.0%)
• <part (horse's head)> == <part (horse's face)> (90.0%)
• <color (horse's coat, darker brown)> == <color (horse, brown)> (90.0%)
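
Each bullet under hscore and tscore pairs two tuples with a similarity percentage. For readers who want to analyze such listings programmatically, a small parser along the following lines may help. It assumes the bullet layout shown in these examples, splits tuple fields naively on commas, and treats a right-hand side of `<image>` (which appears in some later listings, and whose meaning is not stated here) as "no tuple". The names GraphTuple, Match, parse_tuple, and parse_match_line are hypothetical and not part of the authors' pipeline.

```python
import re
from typing import NamedTuple, Optional


class GraphTuple(NamedTuple):
    kind: str     # e.g. "spatial", "color", "whole"
    args: tuple   # fields inside the parentheses


class Match(NamedTuple):
    left: Optional[GraphTuple]
    right: Optional[GraphTuple]   # None when the side is "<image>"
    score: float                  # percentage, e.g. 90.0


def parse_tuple(text):
    text = text.strip()
    if text == "<image>":
        return None
    m = re.fullmatch(r"<\s*([\w ]+?)\s*\((.*)\)\s*>", text)
    if m is None:
        raise ValueError(f"unrecognized tuple: {text!r}")
    kind, body = m.groups()
    # Naive split: a comma inside quoted text would be split as well.
    args = tuple(part.strip() for part in body.split(","))
    return GraphTuple(kind, args)


def parse_match_line(line):
    line = line.strip().lstrip("•").strip()
    # The score is the last parenthesized group on the line.
    pair, _, score_part = line.rpartition("(")
    score = float(score_part.rstrip(")% \t"))
    left_text, sep, right_text = pair.partition("==")
    if not sep:
        raise ValueError(f"missing '==' in line: {line!r}")
    return Match(parse_tuple(left_text), parse_tuple(right_text), score)


example = parse_match_line(
    "• <color (horse's hair, blonde)> == "
    "<color (horse's mane, light blonde or cream)> (90.0%)"
)
print(example.left.kind, example.left.args, example.score)
```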
Image
Caption

A medium-close-up view of a layered cake that is brown. The cake is made up of two layers, the top portion of the cake is covered in white cream, while the second portion of the cake is also covered in white cream. The portion also has a small decorative piece located at its center. The cake is sitting on a cardboard cutout that is shaped like a circle. The cake is sitting on a gray plate that is circular and has white trimming. Underneath the plate, there is a white cake stand that has brown trimming. Underneath the stand is a brown mat that is made of lines that run vertically and horizontally. On the left side of the mat, there is a metal holder. The cake is placed inside a fridge.

Scene Graph
QA + verification programs
Q: What supports the cake stand which holds the cake?
A: The cake stand, which holds the cake, is supported by a brown mat.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'mat' in entity:
                relationships_to = scene_graph.get_outgoing_relations(entity)
                if 'stand' in relationships_to and 'underneath' in relationships_to['stand']['spatial']:
                    return entity
        return None

gpt-4o: The cake stand holding the cake is supported by a pedestal base. The pedestal base is wide at the bottom and narrows as it goes up to support the flat surface on which the cake is placed. This design provides stability and elevates the cake for better presentation.
hscore: 66.62%
• <spatial (cake stand, mat, supported by)> == <spatial (cake stand, pedestal base, on)> (80.0%)
• <spatial (cake, cake stand, on)> == <spatial (cake stand, pedestal base, on)> (80.0%)
• <whole (cake)> == <whole (cake stand)> (80.0%)
tscore: 65.08%
• <whole (cake stand)> == <whole (cake stand)> (100.0%)
• <state (cake stand, presentation)> == <state (cake, sit)> (80.0%)
• <spatial (cake stand, pedestal base, on)> == <spatial (cake stand, plate, underneath)> (80.0%)
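
The program for this question matches the entity name by substring ('mat' in entity) but looks up the relation target by exact dictionary key ('stand' in relationships_to). Whether that key exists depends on how entities are named in the scene graph, which is not shown here. If one wanted the relation lookup to be as tolerant as the entity match, a helper along these lines could be used. This is a hypothetical sketch on hand-written toy data, not part of the original programs; find_relations and its parameters are illustrative names.

```python
def find_relations(relations, name_fragment, rel_type, value_fragment):
    """Return target names that contain name_fragment and whose rel_type
    value contains value_fragment."""
    hits = []
    for target, details in relations.items():
        if name_fragment in target and value_fragment in details.get(rel_type, ""):
            hits.append(target)
    return hits


# Toy outgoing relations for the mat (assumed naming: "cake stand").
mat_relations = {"cake stand": {"spatial": "underneath"}}

exact_hit = "stand" in mat_relations   # exact key lookup
fuzzy_hits = find_relations(mat_relations, "stand", "spatial", "underneath")
print(exact_hit, fuzzy_hits)
```

In this toy case the exact key lookup misses because the target is stored as "cake stand", while the substring helper finds it. The trade-off is that substring matching can also return unintended targets when several entity names share a fragment.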

Q: What entity is both white and underneath the circular gray plate?
A: The cake stand is white and located underneath the circular gray plate.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'cake stand' in entity:
                attributes = scene_graph.get_attributes(entity)
                relations_out = scene_graph.get_outgoing_relations(entity)
                if 'plate' in relations_out and 'underneath' in relations_out['plate']['spatial']:
                    return attributes.get('color', "")
        return None

gpt-4o: The white entity underneath the circular gray plate is a white cake stand.
hscore: 91.25%
• <color (cake stand, white)> == <color (cake stand, white)> (100.0%)
• <whole (cake stand)> == <whole (cake stand)> (100.0%)
• <spatial (cake stand, circular plate, underneath)> == <spatial (circular gray plate, cake stand, underneath)> (90.0%)
tscore: 90.75%
• <color (cake stand, white)> == <color (cake stand, white)> (100.0%)
• <whole (cake stand)> == <whole (cake stand)> (100.0%)
• <spatial (cake stand, circular gray plate, under)> == <spatial (cake stand, plate, underneath)> (90.0%)
Image
Caption

An outdoor angled down view of a portion of a field that is covered with tall white spring star flowers that are rooted into a ground floor made up of moist dark colored soil and also consists of small green plants. Behind the field covered in white flowers is another field that is made up of only dark colored soil.

Scene Graph
QA + verification programs
Q: Which field is in front of the dark soil field?
A: Field 1 is in front of the dark soil field (Field 2).
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'field_2' in entity:
                incoming_relations = scene_graph.get_incoming_relations(entity)
                for source_entity in incoming_relations:
                    if 'field_1' in source_entity and 'in front of' in incoming_relations[source_entity]['spatial']:
                        return source_entity
        return None

gpt-4o: The field in front of the dark soil field is a field of white flowers. The flowers are scattered across the area, growing among green grass and other small plants.
hscore: 62.67%
• <state (Field 2, dark soil)> == <state (field with dark soil, front)> (90.0%)
• <whole (Field 1)> == <whole (field with white flowers)> (50.0%)
• <whole (Field 2)> == <whole (field with white flowers)> (50.0%)
tscore: 83.89%
• <state (field with white flowers, scattered)> == <state (field, covered with flowers)> (90.0%)
• <color (field with white flowers, white)> == <color (field's flowers, white)> (90.0%)
• <state (field with white flowers, growing)> == <state (field, covered with flowers)> (80.0%)

Q: What is the state of the plants in the field covered with white flowers?
A: The plants in the field covered with white flowers are small and green.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'plants' in entity:
                return scene_graph.describe(scene_graph.generate_subgraph([entity]))
        return None

gpt-4o: The plants in the field appear to be in good health. They are covered with numerous white flowers, indicating that they are in bloom. The green stems and leaves suggest that the plants are well-nourished and thriving. There are no visible signs of disease or significant damage.
hscore: 83.9%
• <part (plants' flowers)> == <part (plants' flowers)> (100.0%)
• <color (plants' flowers, white)> == <color (plants' flowers, white)> (100.0%)
• <state (plants, small)> == <state (plants, no disease)> (80.0%)
tscore: 82.48%
• <color (plants' leaves, green)> == <color (plants, green)> (90.0%)
• <state (plants, in bloom)> == <state (plants, small)> (80.0%)
• <color (plants' stems, green)> == <color (plants, green)> (80.0%)
Image
Caption

A monster truck is in the air and jumping over a line of red crushed cars. The truck has a color scheme of red and blue. The word "BIG FOOT" is written on the side of the truck in big white letters. The grille of the truck is silver. An orange truck with black stripes is in the background behind the red and blue truck. The word "Hot Wheels" is inside a flame logo on the blue wall. The "H" is partially covered by the truck's shark fin design. An orange van is partially cut off from the right of the image next to a crushed blue car. An audience of people are in the background in the stands watching the trucks.

Scene Graph
QA + verification programs
Q: Which truck is behind the red and blue truck?
A: The orange truck with black stripes is behind the red and blue truck.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'truck' in entity:
                relations = scene_graph.get_outgoing_relations(entity)
                for relation, details in relations.items():
                    if "red and blue" in scene_graph.get_attributes(relation).get('color', "") and 'behind' in details.get('spatial', ""):
                        return scene_graph.get_attributes(entity).get('color', "")
        return None

gpt-4o: The truck behind the red and blue truck is an orange and black truck with a tiger-like design.
hscore: 89.73%
• <color (red and blue truck, blue)> == <color (red and blue truck, blue)> (100.0%)
• <color (red and blue truck, red)> == <color (red and blue truck, red)> (100.0%)
• <state (orange truck, has black stripes)> == <state (orange and black truck, behind)> (90.0%)
tscore: 84.99%
• <color (red and blue truck, red)> == <color (truck, red and blue)> (90.0%)
• <color (orange and black truck, black)> == <color (truck, orange)> (90.0%)
• <color (red and blue truck, blue)> == <color (truck, red and blue)> (90.0%)

Q: What design feature is partially covered by the truck?
A: The blue wall's "H" is partially covered by the truck's shark fin design.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if "H" in entity:
                relations = scene_graph.get_outgoing_relations(entity)
                for relation, details in relations.items():
                    if "shark fin design" in relation and "partially covered by" in details.get('spatial', ""):
                        return 'The "H" is partially covered by the truck’s shark fin design.'
        return None

gpt-4o: The truck in the image is partially covering a Hot Wheels logo on the blue banner in the background.
hscore: 55.62%
• <spatial (truck's shark fin design, wall's text, over)> == <spatial (truck, Hot Wheels logo, covering)> (70.0%)
• <color (wall, blue)> == <color (blue banner, blue)> (70.0%)
• <whole (wall)> == <whole (truck)> (60.0%)
tscore: 73.54%
• <whole (truck)> == <whole (truck)> (100.0%)
• <spatial (truck, Hot Wheels logo, covering)> == <image> (70.0%)
• <color (blue banner, blue)> == <image> (70.0%)
Image
Caption

A front view of a stone sculpture of a Trojan warrior with a Trojan helmet, shield on his left hand with dirt on it, and a metal spear. The sculpture is on top of a rock with a bigger rock behind it. The rocks are in a fountain of water. Some parts of the water are bubbling and raising from the surface as if it is a mini geyser on the left and right of the sculpture. Light is shining from beneath the water surface. The wing of another sculpture is seen to the left of the Trojan warrior. Another white sculpture is in the background behind the Trojan warrior on a bigger rock formation. A stone pillar is to the right of the Trojan Warrior. Two tops of stone formations are next to each other in the background. Shadows are being cast on the stones. A clear blue nighttime sky is above the sculptures.

Scene Graph
QA + verification programs
Q: Where is the stone pillar located in relation to the Trojan Warrior?
A: The stone pillar is to the right of the Trojan Warrior.
    def execute_command(scene_graph):
        location = None
        for entity in scene_graph.get_entities():
            if 'stone pillar' in entity:
                outgoing = scene_graph.get_outgoing_relations(entity)
                if 'Trojan Warrior' in outgoing:
                    location = outgoing['Trojan Warrior'].get('spatial', '')
        return f"The stone pillar is {location} the Trojan Warrior." if location else None

gpt-4o: The stone pillar is located to the right of the Trojan Warrior in the image.
hscore: 89.4%
• <spatial (stone pillar, Trojan Warrior, to the right of)> == <spatial (pillar, Trojan Warrior, to the right of)> (80.0%)
tscore: 91.88%
• <whole (Trojan Warrior)> == <whole (Trojan Warrior)> (100.0%)
• <whole (pillar)> == <whole (stone pillar)> (80.0%)
• <spatial (pillar, Trojan Warrior, to the right of)> == <spatial (stone pillar, Trojan Warrior, to the right of)> (80.0%)

Q: What is the condition of the water in the fountain?
A: The water is behaving like a mini geyser, bubbling, and rising from the surface.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'water' in entity:
                state = scene_graph.get_attributes(entity).get('state', '')
                return f"The water is {state} in the fountain."
        return "No water condition information found."

gpt-4o: The water in the fountain appears to be clear and flowing smoothly. You can see the water cascading down the rocks in the background, and there are no visible signs of debris or discoloration. The overall condition of the water looks clean and well-maintained.
hscore: 63.17%
• <state (water, surface)> == <state (water, smooth)> (70.0%)
• <state (water, bubbling)> == <state (water, smooth)> (70.0%)
• <action (water, rising)> == <state (water, flowing)> (50.0%)
tscore: 71.27%
• <whole (water in the fountain)> == <whole (fountain)> (90.0%)
• <spatial (water, rocks, cascading down)> == <spatial (rocks, fountain, in)> (80.0%)
• <state (water, clear)> == <state (sky, clear)> (70.0%)
Image
Caption

An eye level side view of a white delivery truck parked on a gray concrete road facing towards the left, the truck is parked on top of painted white lines positioned diagonally and evenly spaced. There is a painting on the side of the truck of a brown rooster facing towards the right, the rooster is holding a silver megaphone to its mouth, to the right of the rooster is a donkey pulling a wooden wagon with a man on it, the donkey is facing towards the right. There are red painted letters above the person on the wooden wagon, the letters read, "FARMERS / MARKET". Behind the rooster and donkey are tall painted green mountains and a blue sky full of white puffy cumulus clouds.

Scene Graph
QA + verification programs
Q: What message is written above the person on the wagon in the scene?
A: The message written above the person on the wagon is 'FARMERS / MARKET'.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'letters' in entity:
                attributes = scene_graph.get_attributes(entity)
                text_rendering = attributes.get('text rendering', None)
                if text_rendering:
                    return text_rendering.strip('"')
        return None

gpt-4o: The message written above the person on the wagon in the scene is "FARMERS MARKET."
hscore: 72.49%
• <whole (person)> == <whole (person)> (100.0%)
• <text rendering (message, 'FARMERS / MARKET')> == <text rendering (wagon, "FARMERS MARKET")> (80.0%)
• <whole (message)> == <whole (person)> (60.0%)
tscore: 95.35%
• <whole (person)> == <whole (person)> (100.0%)
• <whole (wagon)> == <whole (wagon)> (100.0%)
• <spatial (person, wagon, on)> == <spatial (person, wagon, on)> (100.0%)

Q: What is the color and texture of the letters above the wagon?
A: The letters above the wagon are red in color and painted in texture.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'letters' in entity:
                attributes = scene_graph.get_attributes(entity)
                color = attributes.get('color', None)
                texture = attributes.get('texture', None)
                return f"Color: {color}, Texture: {texture}"
        return None

gpt-4o: The letters above the wagon spell out "FARMERS MARKET" and are painted in a light pink color with a slightly weathered texture.
hscore: 70.49%
• <state (wagon's letters, painted)> == <state (letters, painted)> (80.0%)
• <part (wagon's letters)> == <part (letters)> (70.0%)
• <state (wagon's letters, texture)> == <state (letters, painted)> (60.0%)
tscore: 81.65%
• <spatial (letters, wagon, above)> == <spatial (letters, wagon, above)> (100.0%)
• <text rendering (letters, "FARMERS MARKET")> == <text rendering (letters, "FARMERS / MARKET")> (90.0%)
• <state (letters, painted)> == <state (white lines, painted)> (80.0%)
Image
Caption

An outdoor top-down view of lane two of a rubber track on a bright sunny day. The white spray-painted stencil font number, "2" sits between two vertical white lines. The three sections of the number two have a light outline from overspray. The white paint has been worn, as pieces of the black rubber are visible throughout the number and the lines. Small white pieces of rubber are also visible scattered across the track.

Scene Graph
QA + verification programs
Q: Can you explain the condition of the paint on the number "2"?
A: The paint on the number "2" is worn, with pieces of black rubber visible through it.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'paint' in entity:
                return scene_graph.get_attributes(entity).get('state', "")
        return None

gpt-4o: The paint on the number "2" appears to be in relatively good condition. The white paint is still quite visible and contrasts well against the dark background. However, there are some signs of wear and fading, particularly around the edges of the number. The outline of the number is slightly blurred, indicating that the paint may be starting to wear off in those areas. Overall, the paint is still intact and legible, but it shows some minor signs of aging and use.
hscore: 69.15%
• <state (number 2's paint, worn)> == <state (number '2', paint, intact)> (80.0%)
• <state (number 2, paint, worn)> == <state (number '2', paint, intact)> (80.0%)
• <part (number 2's paint)> == <part (number '2')> (70.0%)
tscore: 72.9%
• <state (number '2', paint, worn off, around edges)> == <state (paint, worn)> (80.0%)
• <color (number '2', paint, white)> == <color (paint, white)> (80.0%)
• <part (number '2')> == <image> (70.0%)
Image
Caption

A row of five semi truck cabs parked in a semi truck parking lot facing forward pointing slightly to the left. The cabs are white, there are two cabs that are black at the bottom on the right, a cab with a brown bottom to the left, and a cab with an orange bottom to the far left. There is another cab that is only partially in view on the far left side of the image. There is a small black metal bar gate going across the image in front of the semi truck cabs. There is a green strip of grass the gate is placed on, there is a sidewalk going across the grass. At the very bottom of the image a gray cement road is visible. There is a tall light post with a round cement base behind the gate going vertically up the middle of the image. There are two utility wires extending to the top left corner of the image, the top wire has a yellow tube at its base. The sky is almost completely covered in clouds. There is a white building visible behind the semi truck cabs on the right side of the image.

Scene Graph
QA + verification programs
Q: What is the color of the strip that the gate is placed on?
A: The color of the strip that the gate is placed on is green.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'grass' in entity:
                relations_in = scene_graph.get_incoming_relations(entity)
                if 'gate' in relations_in and 'on' in relations_in['gate']['spatial']:
                    return scene_graph.get_attributes(entity).get('color', '')
        return None

gpt-4o: The strip that the gate is placed on is green, as it is a grassy area.
hscore: 73.08%
• <part (gate's placement)> == <whole (gate)> (70.0%)
• <state (gate's placement, on a strip)> == <spatial (gate, strip)> (70.0%)
• <color (strip, green)> == <state (strip, green)> (70.0%)
tscore: 81.17%
• <whole (gate)> == <whole (gate)> (100.0%)
• <spatial (gate, strip)> == <spatial (gate, grass, on)> (70.0%)
• <state (strip, green)> == <image> (70.0%)

Q: Where can you find the cab that is partially in view?
A: The cab that is partially in view is on the left side of the image.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'cab' in entity and 'partially in view' in scene_graph.get_attributes(entity).get('state', ''):
                relations_out = scene_graph.get_outgoing_relations(entity)
                if 'left side of the image' in relations_out and 'on' in relations_out['left side of the image']['spatial']:
                    return 'on the left side of the image'
        return None

gpt-4o: The cab that is partially in view is located in a lot with several other semi-truck cabs. This appears to be a dealership or a storage yard for trucks. The lot is fenced and has multiple trucks parked in rows. You can find such lots typically near highways, industrial areas, or near truck dealerships and service centers.
hscore: 77.07%
• <state (cab, partially in view)> == <state (cab, partially in view)> (100.0%)
• <spatial (cab, left side of the image, on)> == <state (cab, partially in view)> (50.0%)
tscore: 77.18%
• <state (cab, partially in view)> == <state (cab, partially in view)> (100.0%)
• <state (lot, near industrial areas)> == <image> (70.0%)
• <state (lot, fenced)> == <image> (70.0%)
Image
Caption

Two small, perfectly round stone structures are laid in a brown soil scattered with gravel. The stone structures are pale gray, with medium gray mottle variations throughout. They are smooth, with no roughness visible. They lay side by side horizontally. The soil below is dry and dust-like in a pale brown shade. The gravel ranges in size from large pebbles to tiny chipped pieces. The gravel has been pushed towards the stones on the left side, with no visible hand or footprints seen in the close-up view. The overcast day gives light while creating soft, small shadows below the stone spheres.

Scene Graph
QA + verification programs
Q: How would you describe the texture of the soil surrounding the stone structures?
A: The soil surrounding the stone structures is dry and dust-like, with a texture scattered with gravel ranging from large pebbles to tiny chipped pieces.
    def execute_command(scene_graph):
        for entity in scene_graph.get_entities():
            if 'soil' in entity:
                return scene_graph.get_attributes(entity).get('texture', '')
        return None

gpt-4o: The soil surrounding the stone structures appears to have a rough and uneven texture. It consists of a mix of small to medium-sized gravel and coarse sand, giving it a gritty and rocky appearance. The surface is not smooth and has visible pebbles and stones scattered throughout.
hscore: 69.57%
• <state (soil, dry)> == <state (soil, rough)> (80.0%)
• <part (soil's texture)> == <texture (soil, gritty)> (70.0%)
• <state (soil, dust)> == <state (soil's surface, not smooth)> (70.0%)
tscore: 80.92%
• <whole (soil)> == <whole (soil)> (100.0%)
• <state (soil, rough)> == <state (soil, dry)> (80.0%)
• <texture (soil, gritty)> == <texture (soil, gravel)> (80.0%)
Image
Caption

A medium view of a bed with white sheets and wrinkles and two white fluffy pillows. Below is a black leather bench with silver balls all over it. A back long headboard with two vertical lines is standing against a tan wall. On the top left wall there is a black framed picture hanging up. A light brown marble night stand is next to the bed with a lit fluorescent silver lamp with a white top and a phone next to it. The reflection of the lamp is seen on the wall. The bottom right corner has a view of a dark gray carpet with black lines.

Scene Graph
    Scene Graph
QA + verification programs
Q: Which entity is against the tan wall?
A: The headboard with two vertical lines is standing against the tan wall.
def execute_command(scene_graph):
    for entity in scene_graph.get_entities():
        if 'headboard' in entity:
            relations_out = scene_graph.get_outgoing_relations(entity)
            if 'wall' in relations_out and 'against' in relations_out['wall']['spatial']:
                return scene_graph.describe(scene_graph.generate_subgraph([entity]))
    return None

gpt-4o: The bed is against the tan wall in the image.
hscore: 64.46%
• <spatial (headboard, wall, against)> == <spatial (bed, wall, against)> (90.0%)
• <whole (headboard)> == <whole (bed)> (70.0%)
• <state (headboard, standing)> == <spatial (bed, wall, against)> (50.0%)
tscore: 96.72%
• <whole (bed)> == <whole (bed)> (100.0%)
• <color (wall, tan)> == <color (wall, tan)> (90.0%)
• <spatial (bed, wall, against)> == <spatial (headboard, wall, against)> (90.0%)
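The headboard program above relies on get_outgoing_relations, generate_subgraph, and describe. The sketch below shows one way these calls could fit together, assuming relations are stored as nested dicts. The class ToyRelationGraph, its storage layout, and the text format produced by describe are all hypothetical and are not taken from the authors' code.

```python
class ToyRelationGraph:
    """Hypothetical stand-in that stores only outgoing relations."""

    def __init__(self, relations):
        # Assumed layout: {subject: {object: {relation type: value}}}
        self._relations = relations
        # Entities are all subjects plus all objects, in first-seen order.
        self._entities = []
        for subj, objs in relations.items():
            for name in [subj, *objs]:
                if name not in self._entities:
                    self._entities.append(name)

    def get_entities(self):
        return list(self._entities)

    def get_outgoing_relations(self, entity):
        return self._relations.get(entity, {})

    def generate_subgraph(self, entities):
        # Keep only the outgoing relations of the requested entities.
        return ToyRelationGraph(
            {e: self._relations.get(e, {}) for e in entities}
        )

    def describe(self, subgraph):
        # Assumed rendering: "<subject> <relation value> <object>".
        parts = []
        for subj in subgraph.get_entities():
            for obj, rels in subgraph.get_outgoing_relations(subj).items():
                for value in rels.values():
                    parts.append(f"{subj} {value} {obj}")
        return "; ".join(parts)


def execute_command(scene_graph):
    # Verification program from the example, unchanged in logic.
    for entity in scene_graph.get_entities():
        if 'headboard' in entity:
            relations_out = scene_graph.get_outgoing_relations(entity)
            if 'wall' in relations_out and 'against' in relations_out['wall']['spatial']:
                return scene_graph.describe(scene_graph.generate_subgraph([entity]))
    return None


graph = ToyRelationGraph({'headboard': {'wall': {'spatial': 'against'}}})
result = execute_command(graph)
print(result)
```

In this toy setting, a graph that records only the bed as being against the wall makes the program return None, since the program looks specifically for a headboard entity.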

Q: What object is on the top left wall?
A: A black framed picture is hanging up on the top left wall.
def execute_command(scene_graph):
    for entity in scene_graph.get_entities():
        if 'picture' in entity:
            relations_out = scene_graph.get_outgoing_relations(entity)
            if 'wall' in relations_out and 'on' in relations_out['wall']['spatial']:
                return scene_graph.describe(scene_graph.generate_subgraph([entity]))
    return None

gpt-4o: The object on the top left wall appears to be a framed picture or artwork.
hscore: 64.62%
• <spatial (picture, wall, on)> == <spatial (object, wall, on)> (80.0%)
• <whole (picture)> == <whole (object)> (70.0%)
• <spatial (picture, top left wall, hanging up on)> == <spatial (object, wall, on)> (60.0%)
tscore: 70.92%
• <spatial (object, wall, on)> == <spatial (picture, wall, on)> (80.0%)
• <whole (object)> == <whole (picture)> (70.0%)
• <spatial (object, top left, at)> == <spatial (wall, top left, on)> (70.0%)

Q: What is the state of the sheets on the bed?
A: The sheets on the bed are wrinkled and white in color.
def execute_command(scene_graph):
    for entity in scene_graph.get_entities():
        if 'sheets' in entity:
            relations_out = scene_graph.get_outgoing_relations(entity)
            if 'bed' in relations_out and 'on' in relations_out['bed']['spatial']:
                return scene_graph.describe(scene_graph.generate_subgraph([entity]))
    return None

gpt-4o: The sheets on the bed are neatly made and appear to be in a clean and tidy state. The bed is well-prepared with the sheets smoothly spread out and the pillows arranged neatly at the head of the bed.
hscore: 64.24%
• <state (sheets, wrinkled)> == <state (bed's sheets, smoothly spread out)> (70.0%)
• <whole (sheets)> == <whole (bed)> (60.0%)
• <color (sheets, white)> == <state (bed's sheets, smoothly spread out)> (50.0%)
tscore: 80.82%
• <whole (bed)> == <whole (bed)> (100.0%)
• <spatial (bed's pillows, head of the bed, at)> == <spatial (pillows, bed, on)> (90.0%)
• <state (bed's pillows, neatly arranged)> == <state (pillows, fluffy)> (80.0%)